# LLM-Assisted Coding for Hydrology and GIS

## Introduction

In this notebook, we'll explore how to use AI-assisted coding to perform hydrological calculations and integrate GIS data. We'll start with basic hydrological functions and progressively move to more complex operations involving GIS data processing.

The code in the code cells was generated by ChatGPT from the prompts provided throughout this notebook and then pasted in. To view the ChatGPT conversation where all the code was generated, see the link below. ChatGPT will probably not generate exactly the same code as appears in the code cells, but that is fine as long as the code returns the expected output. Each prompt is provided after the **Prompt:** label, within quotation marks. Copy and paste these prompts into ChatGPT, and it should generate code very similar to the code in the cells. This works because the prompts are very specific: the more detail you leave out of a prompt, the more ChatGPT has to infer, and the more its code is likely to deviate from the provided code.

**ChatGPT Conversation:** https://chatgpt.com/share/b25a1933-5ad9-476a-96f9-b8f8a5d18c74

Once we begin working with outside data sources, you will need to update the file paths in ChatGPT's generated code to wherever the provided data resides on your machine. You may also want to modify the example usage of the functions, depending on how ChatGPT presents its generated code, but the functions themselves shouldn't need any modification.

If you encounter an error running the ChatGPT-generated code, try resending the prompt: hover over your message in ChatGPT, click the pencil icon that appears, and hit send again. I've tested the prompts several times and they have worked for me, but that's not a guarantee they will work 100% of the time. There is enough randomness in ChatGPT's responses that simply resending the same message can be enough to correct its course. If resending the same prompt twice still doesn't yield working code, copy and paste the error into ChatGPT and see if it can correct the code.

Start by pip installing the libraries below if you do not already have them installed.

In [None]:
pip install pandas geopandas rasterio scipy -q

## 1. Basic Hydrological Calculations

### 1.1 Rational Method Function

The Rational Method is a simple technique for estimating peak discharge from small watersheds. Let's use AI to help us create a function for this calculation.

When using AI for coding assistance, it's important to provide clear, specific prompts. Here's an example prompt we might use:

**Prompt:** "Write a Python function to calculate peak discharge using the Rational Method. The function should take runoff coefficient (C, unitless), rainfall intensity (i, inches per hour), and drainage area (A, acres) as inputs and return the peak discharge (Q, cfs)."

Now, let's see the function an AI might generate:

In [None]:
def calculate_peak_discharge(C, i, A):
    """
    Calculate the peak discharge using the Rational Method.

    Parameters:
    C (float): Runoff coefficient (unitless)
    i (float): Rainfall intensity (inches per hour)
    A (float): Drainage area (acres)

    Returns:
    float: Peak discharge (cfs)
    """
    # Rational Method formula
    Q = C * i * A

    return Q

# Example usage
C = 0.75  # Runoff coefficient
i = 2.5   # Rainfall intensity (inches/hour)
A = 10    # Drainage area (acres)

peak_discharge = calculate_peak_discharge(C, i, A)
print(f"Peak Discharge: {peak_discharge:.2f} cfs")

Peak Discharge: 18.75 cfs
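
The function returns cfs directly because, in US customary units, one acre-inch per hour is almost exactly one cubic foot per second. The short check below works through that unit conversion. It is a sketch for intuition, not part of the ChatGPT-generated code, and the name `ACRE_INCH_PER_HOUR_TO_CFS` is introduced here for illustration.

```python
# Unit check for the Rational Method (Q = C * i * A) in US customary units.
SQFT_PER_ACRE = 43560       # square feet in one acre
INCHES_PER_FOOT = 12
SECONDS_PER_HOUR = 3600

# 1 acre * 1 in/hr expressed in cubic feet per second
ACRE_INCH_PER_HOUR_TO_CFS = SQFT_PER_ACRE / INCHES_PER_FOOT / SECONDS_PER_HOUR
print(f"1 acre-in/hr = {ACRE_INCH_PER_HOUR_TO_CFS:.4f} cfs")

# Same inputs as the example above
C, i, A = 0.75, 2.5, 10
q_simple = C * i * A                              # conversion factor taken as 1
q_exact = q_simple * ACRE_INCH_PER_HOUR_TO_CFS    # conversion factor applied
print(f"Q (factor taken as 1): {q_simple:.2f} cfs")
print(f"Q (exact conversion):  {q_exact:.2f} cfs")
```

The difference is under 1%, which is why the conversion factor is conventionally dropped.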


**Exercise:** ChatGPT can not only write Python code, but execute the code as well using a tool called "Code Interpreter". Can you instruct ChatGPT to run the function on example data C = 0.60, i = 3.0 in/hr and A = 12 acres?

### 1.2 Using the Function in a Loop

Now that we have our basic function, let's use it to calculate peak discharge for multiple subbasins. We'll use a pandas dataframe to store subbasin data.

**Prompt:**
"Write a Python script that uses the rational method function you just created in a for loop to calculate peak discharge for multiple subbasins. Generate a pandas dataframe to store subbasin data including subbasin name, C (runoff coefficient), i (rainfall intensity in inches per hour), and A (drainage area in acres) values."

Based on this prompt, here's an example of the code an AI might generate:

In [None]:
import pandas as pd

# Define the function to calculate peak discharge
def calculate_peak_discharge(C, i, A):
    """
    Calculate the peak discharge using the Rational Method.

    Parameters:
    C (float): Runoff coefficient (unitless)
    i (float): Rainfall intensity (inches per hour)
    A (float): Drainage area (acres)

    Returns:
    float: Peak discharge (cfs)
    """
    Q = C * i * A
    return Q

# Subbasin data (name, C, i, A)
subbasin_data = [
    {"name": "Subbasin 1", "C": 0.60, "i": 3.0, "A": 12},
    {"name": "Subbasin 2", "C": 0.75, "i": 2.8, "A": 10},
    {"name": "Subbasin 3", "C": 0.50, "i": 3.2, "A": 15},
    {"name": "Subbasin 4", "C": 0.65, "i": 2.5, "A": 8},
    {"name": "Subbasin 5", "C": 0.70, "i": 3.1, "A": 9}
]

# Initialize an empty list to store results
results = []

# Loop through each subbasin and calculate the peak discharge
for subbasin in subbasin_data:
    name = subbasin["name"]
    C = subbasin["C"]
    i = subbasin["i"]
    A = subbasin["A"]

    # Calculate peak discharge
    Q = calculate_peak_discharge(C, i, A)

    # Store the result
    results.append({
        "Subbasin Name": name,
        "Runoff Coefficient (C)": C,
        "Rainfall Intensity (i) (in/hr)": i,
        "Drainage Area (A) (acres)": A,
        "Peak Discharge (Q) (cfs)": Q
    })

# Create a pandas dataframe from the results
df_subbasins = pd.DataFrame(results)

# Display the dataframe
print(df_subbasins)


  Subbasin Name  Runoff Coefficient (C)  Rainfall Intensity (i) (in/hr)  \
0    Subbasin 1                    0.60                             3.0   
1    Subbasin 2                    0.75                             2.8   
2    Subbasin 3                    0.50                             3.2   
3    Subbasin 4                    0.65                             2.5   
4    Subbasin 5                    0.70                             3.1   

   Drainage Area (A) (acres)  Peak Discharge (Q) (cfs)  
0                         12                     21.60  
1                         10                     21.00  
2                         15                     24.00  
3                          8                     13.00  
4                          9                     19.53  


This code demonstrates how we can efficiently calculate peak discharge for multiple subbasins using a loop.
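
Because the Rational Method is a simple product, the same result can also be computed without an explicit loop by multiplying dataframe columns. The sketch below is an alternative to the generated code above, not something ChatGPT produced; the short column names are chosen here for brevity.

```python
import pandas as pd

# Same subbasin inputs as the loop example above
df = pd.DataFrame({
    "name": ["Subbasin 1", "Subbasin 2", "Subbasin 3", "Subbasin 4", "Subbasin 5"],
    "C": [0.60, 0.75, 0.50, 0.65, 0.70],   # runoff coefficient (unitless)
    "i": [3.0, 2.8, 3.2, 2.5, 3.1],        # rainfall intensity (in/hr)
    "A": [12, 10, 15, 8, 9],               # drainage area (acres)
})

# Column-wise multiplication applies Q = C * i * A to every row at once
df["Q_cfs"] = df["C"] * df["i"] * df["A"]
print(df.round(2))
```

For five subbasins the loop and the column-wise version are equally practical; the column-wise form mainly pays off on larger tables.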

**Exercise:** Can you instruct ChatGPT to run the function on example data (if it didn't do so already) and create a bar chart that shows the discharge per unit area for each subbasin? In practice, understanding the discharge per unit area allows you to spot any outliers which may require further investigation.

## 2. Incorporating GIS Data for Calculating Subbasin Parameters
When actually performing hydrologic analyses, we will typically be working with GIS data. We will have ChatGPT help us write code that will take in shapefiles and rasters and perform geospatial operations utilizing Geopandas and Rasterio to determine hydrologic parameters.

### 2.1 Subbasin Areas
We'll start by having ChatGPT write code for the easiest parameter to determine: the subbasin area, calculated from the provided Subbasins.shp shapefile.

**Prompt:** "Write a function that takes a polygon shapefile as an input, calculates the area of each polygon in acres. The function should return a pandas dataframe containing the original shapefile attributes and the calculated area. Assume the shapefile coordinates are in US Customary units."

In [None]:
import geopandas as gpd

def calculate_polygon_area_in_acres(shapefile_path):
    """
    Calculate the area of each polygon in a shapefile in acres.

    Parameters:
    shapefile_path (str): The path to the polygon shapefile.

    Returns:
    pd.DataFrame: A pandas DataFrame containing the original shapefile attributes and the calculated area in acres.
    """
    # Read the shapefile using geopandas
    gdf = gpd.read_file(shapefile_path)

    # Ensure the geometries are in projected coordinates (feet) if necessary
    # If the CRS is not already in feet (assumed), we may need to reproject (EPSG: 2249 for feet-based CRS, for example)

    # Calculate the area in square feet and convert to acres
    gdf['Area (acres)'] = gdf.geometry.area / 43560  # Convert square feet to acres

    # Convert the geopandas GeoDataFrame to a pandas DataFrame
    df = pd.DataFrame(gdf.drop(columns='geometry'))

    return df

# Example usage
shapefile_path = "/content/Subbasins.shp"
result_df = calculate_polygon_area_in_acres(shapefile_path)
print(result_df)



    name       area  Area (acres)
0  SB010  59.410480     59.410480
1  SB020  64.968865     64.968865
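
For planar (projected) coordinates, a polygon's area follows from its vertex coordinates, and dividing square feet by 43,560 gives acres. The sketch below illustrates that conversion with the shoelace formula on a hypothetical 660 ft by 66 ft rectangle, which is exactly one acre. The helper `polygon_area_sqft` is introduced here for illustration and is not a description of how GeoPandas computes areas internally.

```python
SQFT_PER_ACRE = 43560

def polygon_area_sqft(vertices):
    """Planar area of a simple polygon from (x, y) vertices in feet (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

# Hypothetical rectangle: 660 ft x 66 ft (one furlong by one chain)
rectangle = [(0, 0), (660, 0), (660, 66), (0, 66)]
area_sqft = polygon_area_sqft(rectangle)
area_acres = area_sqft / SQFT_PER_ACRE
print(f"{area_sqft:.0f} sq ft = {area_acres:.2f} acres")
```

If the shapefile's coordinates were in meters or degrees, dividing by 43,560 would give a wrong answer, so it is worth confirming the CRS units before trusting the result.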


**Exercise:** Can you have ChatGPT explain how a geopandas dataframe is different than a pandas dataframe?

### 2.2 Integrating GIS Data for Area-Weighted Average Runoff Coefficients

The runoff coefficient (C) is determined based on the types of land uses within the subbasin. There is a "Land_Use.shp" shapefile provided with the land uses delineated and C values assigned to each land use in the attribute table of the shapefile in the field named "C_Value". The runoff coefficient for each subbasin is determined by calculating the area-weighted average C value within each subbasin. We will have ChatGPT write us a function that will perform these area-weighted C value calculations based on the intersection of the land use and subbasins data.

To test the function that ChatGPT creates, you will need to pass in four arguments: the subbasin shapefile path, the land use shapefile path, the attribute table field for the C values ("C_Value" in this case), and the attribute table field for the subbasin IDs ("name" in this case).

**Prompt:** "Write me a function that will take two polygon shapefiles as input, one representing subbasins and the other representing land use, as well as an attribute table field name for where the land use runoff coefficient (C) is stored and an attribute table field name where the subbasin IDs are stored. The function should calculate the area weighted average runoff coefficient (C) for each subbasin in the subbasins shapefile. The function should return a pandas dataframe containing the original subbasin shapefile attributes and the calculated area-weighted average runoff coefficients. Assume the coordinates for both shapefiles are in US customary units."

In [None]:
def calculate_area_weighted_runoff(subbasins_shapefile, landuse_shapefile, landuse_C_field, subbasin_id_field):
    """
    Calculate the area-weighted average runoff coefficient (C) for each subbasin.

    Parameters:
    subbasins_shapefile (str): The path to the subbasin polygon shapefile.
    landuse_shapefile (str): The path to the land use polygon shapefile.
    landuse_C_field (str): The field name where the land use runoff coefficient (C) is stored in the land use shapefile.
    subbasin_id_field (str): The field name where the subbasin IDs are stored in the subbasin shapefile.

    Returns:
    pd.DataFrame: A pandas DataFrame containing the original subbasin attributes and the calculated area-weighted average runoff coefficients (C).
    """
    # Step 1: Load the subbasins and land use shapefiles using GeoPandas
    subbasins_gdf = gpd.read_file(subbasins_shapefile)
    landuse_gdf = gpd.read_file(landuse_shapefile)

    # Step 2: Overlay (spatial join) the land use polygons with the subbasin polygons
    # This performs an intersection to calculate the overlapping areas of land use in each subbasin
    overlay_gdf = gpd.overlay(landuse_gdf, subbasins_gdf, how='intersection')

    # Step 3: Calculate the area of each intersected polygon (in square feet)
    overlay_gdf['Area_sqft'] = overlay_gdf.geometry.area

    # Step 4: Calculate the weighted contribution of each land use polygon to the total subbasin
    # Multiply the land use area by its runoff coefficient (C)
    overlay_gdf['Weighted_C'] = overlay_gdf[landuse_C_field] * overlay_gdf['Area_sqft']

    # Step 5: Group by subbasin to calculate the area-weighted average runoff coefficient for each subbasin
    grouped = overlay_gdf.groupby(subbasin_id_field).agg(
        total_area=('Area_sqft', 'sum'),
        total_weighted_C=('Weighted_C', 'sum')
    )

    # Calculate the area-weighted average runoff coefficient for each subbasin
    grouped['Area_Weighted_C'] = grouped['total_weighted_C'] / grouped['total_area']

    # Step 6: Merge the calculated values back with the original subbasin attributes
    result_gdf = subbasins_gdf.merge(grouped[['Area_Weighted_C']], left_on=subbasin_id_field, right_index=True)

    # Step 7: Convert the result to a pandas DataFrame (excluding the geometry)
    result_df = pd.DataFrame(result_gdf.drop(columns='geometry'))

    return result_df

# Example usage:
subbasins_shapefile = "/content/Subbasins.shp"
landuse_shapefile = "/content/Land_Use.shp"
landuse_C_field = "C_Value"
subbasin_id_field = "name"
result_df = calculate_area_weighted_runoff(subbasins_shapefile, landuse_shapefile, landuse_C_field, subbasin_id_field)
print(result_df)


    name       area  Area_Weighted_C
0  SB010  59.410480         0.539037
1  SB020  64.968865         0.621546
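
The calculation inside the function reduces to a weighted mean: sum each land use's C times its area within the subbasin, then divide by the total area. The sketch below shows that arithmetic on hypothetical land-use pieces; the areas and C values are made up for illustration and are not taken from the provided shapefiles.

```python
# Hypothetical land-use pieces within one subbasin: (area in acres, runoff coefficient C)
landuse_pieces = [
    (20.0, 0.90),  # e.g. commercial
    (30.0, 0.50),  # e.g. residential
    (50.0, 0.20),  # e.g. open space
]

total_area = sum(area for area, _ in landuse_pieces)
weighted_sum = sum(area * c for area, c in landuse_pieces)

# Area-weighted average C = sum(C_k * A_k) / sum(A_k)
area_weighted_c = weighted_sum / total_area
print(f"Area-weighted C: {area_weighted_c:.2f}")
```

Because the areas appear in both the numerator and the denominator, the result is the same whether the areas are in square feet or acres.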


### 2.3 Integrating GIS Data for Time of Concentration

Now, let's integrate GIS data to calculate the time of concentration using the Kirpich formula. With the Kirpich method, the time of concentration depends on the length and slope of the longest flow path within a subbasin. We'll use the provided flow path line shapefile and the raster digital elevation model (DEM), and have ChatGPT write a function to extract elevations from the DEM, calculate the length and slope of each flow path, and finally calculate the time of concentration for each subbasin. The function requires as inputs the flow path shapefile, the DEM, and the attribute table field name that contains the ID for each flow path (just the string "name" in our case).

In the prompt you'll notice that I refer to the flow paths shapefile as a "MultiLineString". This helps ChatGPT understand the type of data that the function will be working with. A MultiLineString shapefile is a specific type of shapefile geometry that represents multiple lines (or polylines) as a single feature. These individual lines, known as "LineStrings," are grouped together to form a single multi-part feature. This differs from a standard Line shapefile, which typically represents each line as a separate feature.

**Prompt:** "Write a function that takes in a MultiLineString shapefile representing the subbasin flow paths, a raster DEM and the flow path ID attribute table field name as input. Extract the DEM elevations at the start and end of each feature in the MultiLineString shapefile, calculate the length of each feature (in feet), then use this data to calculate the slope (feet/feet) of each feature. Use the determined length and slope to calculate the time of concentration for each feature using the Kirpich method. The Kirpich time of concentration formula is shown below:

Tc = 0.0078(L^0.77)(S^-0.385)

where: Tc = time of concentration (minutes) L = length (feet) S = slope (feet/feet)

The function should return a dataframe with all the original shapefile attributes and the calculated lengths, start and end elevations, slopes and time of concentration (minutes). Assume the coordinates of the line shapefile are in US Customary units."

In [None]:
import rasterio
from shapely.geometry import LineString
import numpy as np

def calculate_time_of_concentration(flowpaths_shapefile, dem_raster, flowpath_id_field):
    """
    Calculate the length, start and end elevations, slope, and time of concentration (Tc) for each flow path
    in a MultiLineString shapefile using the Kirpich method.

    Parameters:
    flowpaths_shapefile (str): Path to the MultiLineString shapefile representing the subbasin flow paths.
    dem_raster (str): Path to the raster DEM file.
    flowpath_id_field (str): The field name where the flow path IDs are stored.

    Returns:
    pd.DataFrame: A pandas DataFrame containing the original shapefile attributes and the calculated lengths,
                  start and end elevations, slopes, and time of concentration (Tc).
    """
    # Step 1: Load the flowpaths shapefile using GeoPandas
    flowpaths_gdf = gpd.read_file(flowpaths_shapefile)

    # Step 2: Open the DEM raster using Rasterio
    with rasterio.open(dem_raster) as dem:
        # Step 3: Initialize an empty list to store results
        results = []

        # Step 4: Iterate over each flowpath in the shapefile
        for idx, row in flowpaths_gdf.iterrows():
            flowpath = row.geometry

            # Check if the geometry is a LineString or MultiLineString and extract the coordinates
            if isinstance(flowpath, LineString):
                coords = list(flowpath.coords)
            else:
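                # If MultiLineString, use the first component. Parts of a multi-part
                # geometry are accessed through .geoms (direct indexing fails in Shapely 2.x).
                coords = list(flowpath.geoms[0].coords)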

            # Extract the start and end coordinates
            start_coords = coords[0]
            end_coords = coords[-1]

            # Step 5: Get the elevation at the start and end points from the DEM
            start_elev = list(dem.sample([start_coords]))[0][0]
            end_elev = list(dem.sample([end_coords]))[0][0]

            # Step 6: Calculate the length of the flow path (in feet)
            length = flowpath.length

            # Step 7: Calculate the slope (S) in feet/feet
            slope = (start_elev - end_elev) / length

            # Step 8: Calculate the time of concentration (Tc) using the Kirpich method
            if slope > 0:  # Ensure the slope is positive
                Tc = 0.0078 * (length ** 0.77) * (slope ** -0.385)
            else:
                Tc = np.nan  # Handle case where slope is negative or zero (invalid case for the Kirpich formula)

            # Step 9: Store the results
            results.append({
                flowpath_id_field: row[flowpath_id_field],
                'Length (ft)': length,
                'Start Elevation (ft)': start_elev,
                'End Elevation (ft)': end_elev,
                'Slope (ft/ft)': slope,
                'Time of Concentration (min)': Tc
            })

    # Step 10: Convert the results to a pandas DataFrame
    results_df = pd.DataFrame(results)

    # Step 11: Merge the original flowpaths shapefile attributes with the calculated results
    final_df = pd.merge(flowpaths_gdf.drop(columns='geometry'), results_df, on=flowpath_id_field)

    return final_df

# Example usage:
flow_paths_shapefile = '/content/Flow_Paths.shp'
dem_raster = '/content/DEM_NAD83.tif'
flow_path_id_field = "name"  # Example flow path ID field name

result_df = calculate_time_of_concentration(flow_paths_shapefile, dem_raster, flow_path_id_field)
print(result_df)


    name  Length (ft)  Start Elevation (ft)  End Elevation (ft)  \
0  SB010  3023.137260           1536.305908         1471.551514   
1  SB020  2798.769445           1542.776123         1475.628418   

   Slope (ft/ft)  Time of Concentration (min)  
0       0.021420                    16.393994  
1       0.023992                    14.788858  
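
As a quick check of the Kirpich formula, the sketch below recomputes the slope and Tc for flow path SB010 from the length and elevations printed above. The function name `kirpich_tc` is introduced here for illustration.

```python
def kirpich_tc(length_ft, slope_ft_per_ft):
    """Kirpich time of concentration in minutes: Tc = 0.0078 * L^0.77 * S^-0.385."""
    if slope_ft_per_ft <= 0:
        raise ValueError("Kirpich formula requires a positive slope")
    return 0.0078 * length_ft ** 0.77 * slope_ft_per_ft ** -0.385

# Values for flow path SB010 from the output above
length = 3023.137260         # ft
start_elev = 1536.305908     # ft
end_elev = 1471.551514       # ft

# Slope is the elevation drop divided by the flow path length
slope = (start_elev - end_elev) / length
tc = kirpich_tc(length, slope)
print(f"Slope: {slope:.6f} ft/ft")
print(f"Tc: {tc:.2f} min")
```

The negative exponent on slope means steeper flow paths give shorter times of concentration, while the positive exponent on length means longer paths give longer ones.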


**Exercise:** Can you get ChatGPT to explain the code that extracts the raster values at the line start and end points step by step?

### 2.4 Incorporating NOAA Atlas 14 Data

In real-world scenarios, we often need to work with rainfall data from sources like NOAA Atlas 14. The time of concentration (Tc) of a watershed is crucial for determining the appropriate rainfall intensity to use in our calculations. NOAA Atlas 14 provides rainfall depths for various durations, but we typically need to interpolate to find the intensity for our specific Tc. Let's create a function to interpolate rainfall intensity for a given time of concentration using NOAA Atlas 14 data.

**Prompt:** "Write a Python function that interpolates rainfall intensity from NOAA Atlas 14 data for a given time of concentration. The function should:

1. Take a single parameter 'tc' (time of concentration in minutes).
2. Use a predefined csv file containing the NOAA Atlas 14 data where the csv is formatted as first column is durations in minutes and second column is rainfall depths in inches.
3. Return the interpolated rainfall depth in inches and rainfall intensity in inches/hour"

In [None]:
from scipy.interpolate import interp1d

def interpolate_rainfall_intensity(tc, csv_file):
    """
    Interpolates the rainfall depth and calculates rainfall intensity for a given time of concentration (tc)
    using NOAA Atlas 14 data from a CSV file.

    Parameters:
    tc (float): Time of concentration in minutes.
    csv_file (str): Path to the CSV file containing NOAA Atlas 14 data with durations (minutes) and rainfall depths (inches).

    Returns:
    tuple: Interpolated rainfall depth in inches and rainfall intensity in inches per hour.
    """
    # Step 1: Load NOAA Atlas 14 data from the CSV file
    df = pd.read_csv(csv_file)

    # Ensure that the CSV has the correct columns: first column as durations (minutes), second as rainfall depths (inches)
    durations = df.iloc[:, 0].values  # Durations in minutes
    rainfall_depths = df.iloc[:, 1].values  # Rainfall depths in inches

    # Step 2: Use interpolation to estimate the rainfall depth for the given time of concentration (tc)
    # We use 'linear' interpolation between the tabulated durations; a tc outside the table's range is linearly extrapolated.
    interpolate_function = interp1d(durations, rainfall_depths, kind='linear', fill_value='extrapolate')

    # Interpolate the rainfall depth for the given tc (time of concentration)
    interpolated_depth = interpolate_function(tc)

    # Step 3: Calculate the rainfall intensity (inches/hour)
    # Rainfall intensity is calculated as depth (inches) divided by duration (hours)
    rainfall_intensity = interpolated_depth / (tc / 60)  # Convert tc from minutes to hours

    return interpolated_depth, rainfall_intensity

# Example usage:
csv_file = '/content/NOAA_Rainfall_Data.csv'  # Replace with the actual path to your CSV file
tc = 45  # Example time of concentration in minutes
depth, intensity = interpolate_rainfall_intensity(tc, csv_file)
print(f"Interpolated Rainfall Depth: {depth:.3f} inches")
print(f"Rainfall Intensity: {intensity:.3f} inches/hour")



Interpolated Rainfall Depth: 2.015 inches
Rainfall Intensity: 2.687 inches/hour


This function demonstrates linear interpolation, a common technique in hydrology for estimating values between known data points. The NOAA Atlas 14 data provides rainfall depths for specific durations, and we often need to estimate intensities for times of concentration that fall between these durations.
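
To make the interpolation step concrete, the sketch below interpolates a depth between two durations by hand and converts it to an intensity. The two depth-duration points are hypothetical placeholders, not values from NOAA Atlas 14 or the provided CSV, and `linear_interpolate` is a helper introduced here for illustration.

```python
def linear_interpolate(x, x0, y0, x1, y1):
    """Linearly interpolate y at x between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Hypothetical depth-duration points (minutes, inches) bracketing the Tc of interest
d0, p0 = 30, 1.50
d1, p1 = 60, 2.00

tc = 45  # time of concentration in minutes
depth = linear_interpolate(tc, d0, p0, d1, p1)

# Intensity is depth divided by duration, with minutes converted to hours
intensity = depth / (tc / 60)
print(f"Depth: {depth:.3f} in, Intensity: {intensity:.3f} in/hr")
```

Note that the function above interpolates depth first and only then divides by the duration; interpolating intensities directly would generally give a slightly different value.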

**Exercise:** Can you upload the provided "NOAA_Rainfall_Data.csv" to ChatGPT and instruct ChatGPT to create a plot of rainfall depth versus duration? Can you instruct ChatGPT to modify the plot by adding axes labels, grid lines, title or changing the line color?


## 3. Putting It All Together
Now let's have ChatGPT create one final function that uses all of the functions we've created so far. It takes in all the shapefile, raster and CSV data and returns a dataframe with all the calculated subbasin parameters and peak discharges.

**Prompt:** "Create a Python function that utilizes all the functions we've developed in this conversation to:

1. Calculate subbasin area from the subbasins shapefile
2. Calculate the area-weighted average runoff coefficients for each subbasin
3. Calculate the time of concentration for each subbasin
4. Calculate the rainfall intensity for each subbasin
5. Calculate the peak discharge for each subbasin

The function should return a dataframe with all the calculated subbasin parameters including the subbasin ID, area, area-weighted runoff coefficient, time of concentration, rainfall intensity and peak discharge.

Do not rewrite the functions, simply just use the functions that you've already created."



In [None]:
import pandas as pd

def calculate_subbasin_parameters(subbasins_shapefile, landuse_shapefile, flowpaths_shapefile, dem_raster,
                                  flowpath_id_field, landuse_C_field, subbasin_id_field, csv_file_noaa):
    """
    Calculate subbasin parameters including area, area-weighted runoff coefficient, time of concentration,
    rainfall intensity, and peak discharge for each subbasin.

    Parameters:
    subbasins_shapefile (str): Path to the subbasin polygon shapefile.
    landuse_shapefile (str): Path to the land use polygon shapefile.
    flowpaths_shapefile (str): Path to the flow paths (MultiLineString) shapefile.
    dem_raster (str): Path to the DEM raster file.
    flowpath_id_field (str): The field name where flow path IDs are stored in the flowpath shapefile.
    landuse_C_field (str): The field name where the land use runoff coefficient (C) is stored in the land use shapefile.
    subbasin_id_field (str): The field name where subbasin IDs are stored in the subbasin shapefile.
    csv_file_noaa (str): Path to the CSV file containing NOAA Atlas 14 data (durations and rainfall depths).

    Returns:
    pd.DataFrame: A pandas DataFrame containing the subbasin parameters including area, area-weighted C,
                  time of concentration, rainfall intensity, and peak discharge.
    """
    # 1. Calculate subbasin area
    subbasin_area_df = calculate_polygon_area_in_acres(subbasins_shapefile)

    # 2. Calculate the area-weighted average runoff coefficients for each subbasin
    runoff_coefficient_df = calculate_area_weighted_runoff(subbasins_shapefile, landuse_shapefile,
                                                           landuse_C_field, subbasin_id_field)

    # Merge the area and area-weighted C dataframes on the subbasin ID
    subbasin_df = pd.merge(subbasin_area_df, runoff_coefficient_df, on=subbasin_id_field)

    # 3. Calculate the time of concentration for each subbasin
    time_of_concentration_df = calculate_time_of_concentration(flowpaths_shapefile, dem_raster, flowpath_id_field)

    # Merge with the existing dataframe
    subbasin_df = pd.merge(subbasin_df, time_of_concentration_df, on=subbasin_id_field)

    # 4. Calculate the rainfall intensity for each subbasin using time of concentration
    subbasin_df['Rainfall Depth (inches)'] = subbasin_df['Time of Concentration (min)'].apply(
        lambda tc: interpolate_rainfall_intensity(tc, csv_file_noaa)[0])

    subbasin_df['Rainfall Intensity (in/hr)'] = subbasin_df['Time of Concentration (min)'].apply(
        lambda tc: interpolate_rainfall_intensity(tc, csv_file_noaa)[1])

    # 5. Calculate the peak discharge for each subbasin using the Rational Method
    subbasin_df['Peak Discharge (cfs)'] = subbasin_df.apply(
        lambda row: calculate_peak_discharge(row['Area_Weighted_C'], row['Rainfall Intensity (in/hr)'], row['Area (acres)']),
        axis=1)

    # Return the final dataframe with all calculated subbasin parameters
    return subbasin_df

# Define the file paths and field names as variables
subbasins_shapefile = "/content/Subbasins.shp"  # Path to your subbasins shapefile
landuse_shapefile = "/content/Land_Use.shp"  # Path to your land use shapefile
flowpaths_shapefile = "/content/Flow_Paths.shp"  # Path to your flow paths shapefile
dem_raster = "/content/DEM_NAD83.tif"  # Path to your DEM raster file
flowpath_id_field = "name"  # Field name in flowpaths shapefile for the flow path IDs
landuse_C_field = "C_Value"  # Field name in land use shapefile for the runoff coefficient
subbasin_id_field = "name"  # Field name in subbasins shapefile for the subbasin IDs
csv_file_noaa = "/content/NOAA_Rainfall_Data.csv"  # Path to your NOAA Atlas 14 CSV file

# Call the function with the specified variables
result_df = calculate_subbasin_parameters(subbasins_shapefile, landuse_shapefile, flowpaths_shapefile, dem_raster,
                                          flowpath_id_field, landuse_C_field, subbasin_id_field, csv_file_noaa)

# Print or display the resulting DataFrame with subbasin parameters
print(result_df)


    name     area_x  Area (acres)     area_y  Area_Weighted_C  Length (ft)  \
0  SB010  59.410480     59.410480  59.410480         0.539037  3023.137260   
1  SB020  64.968865     64.968865  64.968865         0.621546  2798.769445   

   Start Elevation (ft)  End Elevation (ft)  Slope (ft/ft)  \
0           1536.305908         1471.551514       0.021420   
1           1542.776123         1475.628418       0.023992   

   Time of Concentration (min)  Rainfall Depth (inches)  \
0                    16.393994                 1.382749   
1                    14.788858                 1.329021   

   Rainfall Intensity (in/hr)  Peak Discharge (cfs)  
0                    5.060692            162.065918  
1                    5.391981            217.734421  
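
As a sanity check on the combined workflow, the sketch below recomputes the rainfall intensity and peak discharge for each subbasin from the values printed in the table above. It is a hand check only and does not replace the generated function.

```python
# Values copied from the result table above:
# (name, area in acres, area-weighted C, Tc in minutes, rainfall depth in inches)
subbasins = [
    ("SB010", 59.410480, 0.539037, 16.393994, 1.382749),
    ("SB020", 64.968865, 0.621546, 14.788858, 1.329021),
]

checks = {}
for name, area, c, tc_min, depth_in in subbasins:
    intensity = depth_in / (tc_min / 60)   # depth over duration in hours (in/hr)
    q = c * intensity * area               # Rational Method (cfs)
    checks[name] = (intensity, q)
    print(f"{name}: i = {intensity:.3f} in/hr, Q = {q:.2f} cfs")
```

Both recomputed discharges should agree with the table's "Peak Discharge (cfs)" column to within the rounding of the printed inputs.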
