<a href="https://colab.research.google.com/github/carlibeisel/mason_water_budget/blob/main/reach_compile_data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Compile data for Mason Water Budget**

By: Carli Beisel

Created: August 2024

**Variables:**

*qin* = streamflow at upstream gauge on reach

*qout* = streamflow at downstream gauge on reach

*sr* = surface runoff from agricultural irrigation

*gw* = groundwater flow taken from TVGWFM output

*precip* = monthly precipitation over the reach

*evap* = potential open water evaporation
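
As a reading aid, the residual that the calculation cells in this notebook compute can be written as outputs minus inputs. The symbols are shorthand introduced here, not names used in the code: $E$ is open water evaporation and $P$ is monthly precipitation. All terms are in acre-feet (AF) per month. This reflects the expression in the code as written, in which the groundwater term is compiled but does not appear in the residual.

$$
\text{Residual} = (Q_{out} + E) - (Q_{in} + P + SR)
$$

The "no runoff" variant of the calculation drops the $SR$ term:

$$
\text{Residual}_{\text{no runoff}} = (Q_{out} + E) - (Q_{in} + P)
$$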

In [34]:
## --------------- ##
## IMPORT PACKAGES ##
## --------------- ##

import glob

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Reach Water Budget Calculations & Data

In [None]:
## --------------- ##
##  LOAD DATASETS  ##
## --------------- ##

# Streamflow
drainflow = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/drainflow_out/final_monthly_flow.csv')
drainflow['Month'] = pd.to_datetime(drainflow['Month'], format='%Y-%m')
drainflow['Reach'] = drainflow['Reach'].replace('Noble Drain', 'Noble Reach')

# Precipitation
precip = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/precip_out/reach_precip_monthly.csv')
precip['Reach'] = precip['Reach'].replace('Noble Drain', 'Noble Reach')

# Evaporation (PET)
pet = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/pet_out/reach_pet.csv')
pet['Month'] = pd.to_datetime(pet['Month'].astype(str) + '-2024', format='%m-%Y', errors='coerce')
pet['Reach'] = pet['Reach'].replace('Noble Drain', 'Noble Reach')

# Surface Runoff
surface_runoff = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/irrigation_runoff_out/reach_runoff_final.csv')
surface_runoff.rename(columns={'runoff(af)': 'SR (AF)'}, inplace=True)
surface_runoff['Reach'] = surface_runoff['Reach'].replace('Noble Drain', 'Noble Reach')


# Correct month formatting for surface runoff
month_mapping = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12"
}
surface_runoff['Month'] = surface_runoff['month'].str.lower().map(month_mapping)
surface_runoff['Month'] = pd.to_datetime("2024-" + surface_runoff['Month'], format='%Y-%m', errors='coerce')

# TVGWFM groundwater
gw = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/groundwater_out/gw_final_kaf.csv')
gw['Reach'] = gw['Reach'].replace('Noble Drain', 'Noble Reach')
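# Keep model year 2015 only; .copy() so the in-place edits below act on an independent DataFrame
gw = gw[gw['Year'] == 2015].copy()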
gw.drop(columns=['Stress Period', 'Year'], inplace=True)
days_in_month = {
    1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
    7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31
}
gw['Days in Month'] = gw['Month'].map(days_in_month)
gw['GW (AF)'] = (gw['Flow (cfperday)'] * gw['Days in Month']) / 43560
gw['Month'] = gw['Month'].apply(lambda x: f'2024-{int(x):02d}-01')
gw['Month'] = pd.to_datetime(gw['Month'], format='%Y-%m-%d')
gw = gw[['Reach', 'Month', 'GW (AF)']]
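
The groundwater step above converts a daily flow in cubic feet to a monthly volume in acre-feet (1 acre-foot = 43,560 cubic feet). A minimal sketch of the same arithmetic follows, assuming the month is available as a datetime so that pandas can supply the day count. The function name `cfd_to_monthly_af` and the demo values are hypothetical and are not part of the notebook.

```python
import pandas as pd

CUBIC_FEET_PER_ACRE_FOOT = 43560  # 1 acre-foot = 43,560 cubic feet


def cfd_to_monthly_af(flow_cfd: pd.Series, month_start: pd.Series) -> pd.Series:
    """Convert a daily flow (ft^3/day) to a monthly volume (acre-feet).

    month_start must be a datetime Series; the number of days is read from
    the calendar month of each timestamp.
    """
    days = month_start.dt.days_in_month
    return flow_cfd * days / CUBIC_FEET_PER_ACRE_FOOT


# Hypothetical demo values: exactly one acre-foot per day
demo = pd.DataFrame({
    "Month": pd.to_datetime(["2015-01-01", "2015-02-01"]),
    "Flow (cfperday)": [43560.0, 43560.0],
})
demo["GW (AF)"] = cfd_to_monthly_af(demo["Flow (cfperday)"], demo["Month"])
print(demo)
```

One design point: the day count should come from the model year (2015, where February has 28 days) before the dates are relabeled to 2024, which is a leap year. The hard-coded `days_in_month` table in the cell above is consistent with that choice.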


In [None]:
## ------------------------------ ##
##   Merge into one .csv file (AF)
## ------------------------------ ##

drainflow_selected = drainflow[['Reach', 'Month', 'Qin', 'Qout']]

pet_selected = pet[['Reach', 'Month', 'Reach Evap (AF)']]

# .copy() avoids the pandas SettingWithCopyWarning when 'Month' is reassigned below
precip_selected = precip[['Reach', 'Month', 'Total Monthly Precip (AF)']].copy()
precip_selected['Month'] = pd.to_datetime(precip_selected['Month'], errors='coerce')

# 'Noble Drain' was already renamed to 'Noble Reach' when surface_runoff was loaded
surface_runoff_selected = surface_runoff[['Reach', 'Month', 'SR (AF)']]

# Merge the DataFrames on "Reach" and "Month"
merged_df = drainflow_selected.merge(pet_selected, on=['Reach', 'Month'], how='outer')
merged_df = merged_df.merge(precip_selected, on=['Reach', 'Month'], how='outer')
merged_df = merged_df.merge(surface_runoff_selected, on=['Reach', 'Month'], how='outer')
merged_df = merged_df.merge(gw, on=['Reach', 'Month'], how='outer')

# Save the final merged DataFrame (index=False keeps the row index out of the file)
merged_df.to_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/final_water_budget.csv', index=False)

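
The chain of outer merges above can silently multiply rows if any input has more than one row per (Reach, Month) pair. The sketch below shows one way to guard against that, assuming each input table should be unique on its keys. The helper `merge_on_keys` and the toy reach name `"A"` are hypothetical and are not part of the notebook.

```python
from functools import reduce

import pandas as pd


def merge_on_keys(frames, keys):
    """Outer-merge a list of DataFrames on shared key columns.

    Raises ValueError if any frame has duplicate key rows, since those
    would multiply rows in the merged result.
    """
    for i, frame in enumerate(frames):
        dupes = int(frame.duplicated(subset=keys).sum())
        if dupes:
            raise ValueError(f"frame {i} has {dupes} duplicate key row(s)")
    return reduce(lambda left, right: left.merge(right, on=keys, how="outer"), frames)


# Toy data: the evaporation table is missing February
flow = pd.DataFrame({
    "Reach": ["A", "A"],
    "Month": pd.to_datetime(["2024-01-01", "2024-02-01"]),
    "Qin": [10.0, 12.0],
})
evap = pd.DataFrame({
    "Reach": ["A"],
    "Month": pd.to_datetime(["2024-01-01"]),
    "Evap (AF)": [1.5],
})

merged = merge_on_keys([flow, evap], keys=["Reach", "Month"])
print(merged)
```

With an outer merge, the month that is missing from one table is kept and filled with NaN rather than dropped, so gaps in any dataset stay visible in the compiled file.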


In [None]:
# ---------------------------- #
# Calculate Water Budget
# --------------------------- #

# with Runoff

def calculate_wb_residual(df):
    """
    Calculate the monthly water budget residual in acre-feet (AF) for each reach,
    with irrigation surface runoff counted as an input.

    Residual = outputs - inputs = (Qout + Evap) - (Qin + Precip + SR)

    Parameters:
    df (pd.DataFrame): DataFrame containing 'Reach', 'Month', 'Qin', 'Qout',
        'Total Monthly Precip (AF)', 'Reach Evap (AF)', and 'SR (AF)' columns.

    Returns:
    pd.DataFrame: The input DataFrame (modified in place) with numeric '(AF)'
        columns and a 'Residual (AF)' column added.
    """
    # Coerce each budget term to numeric; values that cannot be parsed become NaN
    df['Qin (AF)'] = pd.to_numeric(df['Qin'], errors='coerce')
    df['Qout (AF)'] = pd.to_numeric(df['Qout'], errors='coerce')
    df['Precip (AF)'] = pd.to_numeric(df['Total Monthly Precip (AF)'], errors='coerce')
    df['Evap (AF)'] = pd.to_numeric(df['Reach Evap (AF)'], errors='coerce')
    df['SR (AF)'] = pd.to_numeric(df['SR (AF)'], errors='coerce')

    # Residual = outputs - inputs
    df['Residual (AF)'] = (df['Qout (AF)'] + df['Evap (AF)']) - (df['Qin (AF)'] + df['Precip (AF)'] + df['SR (AF)'])

    return df

final_df = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/final_water_budget.csv')
residual = calculate_wb_residual(final_df)
residual = residual.drop_duplicates()
residual.to_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/final_water_budget_af.csv', index = False)

In [None]:
# ---------------------------- #
# Calculate Water Budget
# --------------------------- #

# NO RUNOFF

def calculate_wb_residual(df):
    """
    Calculate the monthly water budget residual in acre-feet (AF) for each reach,
    leaving irrigation surface runoff out of the inputs.

    Residual = outputs - inputs = (Qout + Evap) - (Qin + Precip)

    Parameters:
    df (pd.DataFrame): DataFrame containing 'Reach', 'Month', 'Qin', 'Qout',
        'Total Monthly Precip (AF)', and 'Reach Evap (AF)' columns.

    Returns:
    pd.DataFrame: The input DataFrame (modified in place) with numeric '(AF)'
        columns and a 'Residual (AF)' column added.
    """
    # Coerce each budget term to numeric; values that cannot be parsed become NaN
    df['Qin (AF)'] = pd.to_numeric(df['Qin'], errors='coerce')
    df['Qout (AF)'] = pd.to_numeric(df['Qout'], errors='coerce')
    df['Precip (AF)'] = pd.to_numeric(df['Total Monthly Precip (AF)'], errors='coerce')
    df['Evap (AF)'] = pd.to_numeric(df['Reach Evap (AF)'], errors='coerce')

    # Residual = outputs - inputs (no surface runoff term)
    df['Residual (AF)'] = (df['Qout (AF)'] + df['Evap (AF)']) - (df['Qin (AF)'] + df['Precip (AF)'])

    return df

final_df = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/final_water_budget_af.csv')
residual = calculate_wb_residual(final_df)
residual = residual.drop_duplicates()
residual.to_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/final_water_budget_norunoff.csv', index = False)
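
The two cells above define the same function twice and differ only in whether surface runoff is counted as an input. As a sketch of an alternative, a single function with a flag could cover both cases. The name `water_budget_residual`, the `include_runoff` parameter, and the demo numbers are hypothetical, and the sketch assumes the `(AF)` columns are already numeric.

```python
import pandas as pd


def water_budget_residual(df: pd.DataFrame, include_runoff: bool = True) -> pd.DataFrame:
    """Return a copy of df with a 'Residual (AF)' column (outputs - inputs).

    Outputs are Qout + Evap; inputs are Qin + Precip, plus SR when
    include_runoff is True.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    outputs = out["Qout (AF)"] + out["Evap (AF)"]
    inputs = out["Qin (AF)"] + out["Precip (AF)"]
    if include_runoff:
        inputs = inputs + out["SR (AF)"]
    out["Residual (AF)"] = outputs - inputs
    return out


# Hypothetical one-row example
demo = pd.DataFrame({
    "Qin (AF)": [100.0],
    "Qout (AF)": [150.0],
    "Precip (AF)": [5.0],
    "Evap (AF)": [10.0],
    "SR (AF)": [20.0],
})

with_runoff = water_budget_residual(demo, include_runoff=True)
no_runoff = water_budget_residual(demo, include_runoff=False)
print(with_runoff["Residual (AF)"].iloc[0])  # (150 + 10) - (100 + 5 + 20) = 35.0
print(no_runoff["Residual (AF)"].iloc[0])    # (150 + 10) - (100 + 5) = 55.0
```

The two residuals differ by exactly the runoff term, which makes the effect of including `SR (AF)` easy to check row by row.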

## Mason Drainshed Water Budget Calculations & Data

In [64]:
# ------------ #
# Compile Data
# ------------ #

gw_mason = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/groundwater_out/mason_gw_final_kaf.csv')
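# .copy() so the edits below act on an independent DataFrame rather than a slice of gw_mason
gw = gw_mason[gw_mason['Year'] == 2015].copy()
gw.drop(columns=['Stress Period', 'Year'], inplace=True)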
days_in_month = {
    1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
    7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31
}
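gw['Days in Month'] = gw['Month'].map(days_in_month)
# Assumes 'Total GW Contribution (AF/day)' is a daily rate, so multiply by days to get monthly AF
gw['GW (AF)'] = gw['Total GW Contribution (AF/day)'] * gw['Days in Month']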
gw['Month'] = gw['Month'].apply(lambda x: f'2024-{int(x):02d}-01')
gw['Month'] = pd.to_datetime(gw['Month'], format='%Y-%m-%d')
gw = gw[['Month', 'GW (AF)']]


precip_mason = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/precip_out/mason_precip_monthly.csv')
precip_mason['Month'] = pd.to_datetime(precip_mason['Month'], errors='coerce')

runoff_mason = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/irrigation_runoff_out/mason_runoff_final.csv')
runoff_mason.rename(columns={'runoff(af)': 'SR (AF)'}, inplace=True)
month_mapping = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12"
}
runoff_mason['Month'] = runoff_mason['month'].str.lower().map(month_mapping)
runoff_mason['Month'] = pd.to_datetime("2024-" + runoff_mason['Month'], format='%Y-%m', errors='coerce')

pet_mason = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/pet_out/mason_pet.csv')
pet_mason['Month'] = pd.to_datetime(pet_mason['Month'].astype(str) + '-2024', format='%m-%Y', errors='coerce')


flow_mason = pd.read_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/drainflow_out/mason_final_monthly_flow.csv')
flow_mason['Month'] = pd.to_datetime(flow_mason['Month'], format='%Y-%m')
flow_mason.rename(columns={'Inflow (AF)': 'Qin (AF)', 'Outflow (AF)': 'Qout (AF)'}, inplace=True)


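
The runoff table stores months as three-letter abbreviations, which the cell above maps to month numbers before building a datetime. The sketch below wraps that same mapping in a small function and also reports any labels that failed to parse, since `errors='coerce'` would otherwise turn them into `NaT` without notice. The function name `month_abbrev_to_timestamp` and the sample labels are hypothetical.

```python
import pandas as pd

# Same abbreviation-to-number mapping the notebook uses
MONTH_NUMBERS = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12",
}


def month_abbrev_to_timestamp(months: pd.Series, year: int = 2024) -> pd.Series:
    """Map labels like 'apr' or ' Aug ' to the first day of that month.

    Labels that are not in MONTH_NUMBERS become NaT.
    """
    numbers = months.str.strip().str.lower().map(MONTH_NUMBERS)
    return pd.to_datetime(f"{year}-" + numbers, format="%Y-%m", errors="coerce")


# Hypothetical labels: mixed case, stray whitespace, and one unknown spelling
demo = pd.Series(["Apr", " aug ", "sept"])
parsed = month_abbrev_to_timestamp(demo)
unparsed = demo[parsed.isna()]

print(parsed)
print("Could not parse:", unparsed.tolist())
```

Listing the unparsed labels before merging helps catch spelling variants that would otherwise show up later as unexplained missing months.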

In [60]:
runoff_mason

| Unnamed: 0 | Month | SR (AF) | Total Area (acres) |
|---|---|---|---|
| 0 | apr | 848.771714 | 11900.811677 |
| 1 | aug | 1794.23037 | 11900.811677 |
| 2 | dec | 620.155967 | 11900.811677 |
| 3 | feb | 680.720601 | 11900.811677 |
| 4 | jan | 664.617509 | 11900.811677 |
| 5 | jul | 1974.593528 | 11900.811677 |
| 6 | jun | 1573.388763 | 11900.811677 |
| 7 | mar | 636.238241 | 11900.811677 |
| 8 | may | 1404.034787 | 11900.811677 |
| 9 | nov | 590.009663 | 11900.811677 |


In [67]:
## ------------------------------ ##
##   Merge into one .csv file (AF)
## ------------------------------ ##

drainflow_selected = flow_mason[['Month', 'Qin (AF)', 'Qout (AF)']]

pet_selected = pet_mason[['Month', 'Evap (AF)']]

precip_selected = precip_mason[['Month', 'Precip(AF)']]

surface_runoff_selected = runoff_mason[['Month', 'SR (AF)']]

# Merge the DataFrames on "Month" (the drainshed tables have no "Reach" column)
merged_df = drainflow_selected.merge(pet_selected, on=['Month'], how='outer')
merged_df = merged_df.merge(precip_selected, on=['Month'], how='outer')
merged_df = merged_df.merge(surface_runoff_selected, on=['Month'], how='outer')
merged_df = merged_df.merge(gw, on=['Month'], how='outer')

# Save the final merged DataFrame (index=False keeps the row index out of the file)
merged_df.to_csv('/content/drive/MyDrive/Data/Mason Water Budget/data_output/compiled_data/mason_final_water_budget.csv', index=False)
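
Because every merge above uses `how='outer'`, a month that is absent from any one input survives in the compiled table with NaN in that input's column. A short check along the following lines could list those gaps before the file is used. The helper `missing_by_column` and the demo values are hypothetical and are not part of the notebook.

```python
import pandas as pd


def missing_by_column(df: pd.DataFrame, key: str = "Month") -> dict:
    """For each non-key column, list the key values where that column is NaN.

    Columns with no missing values are left out of the result.
    """
    gaps = {}
    for column in df.columns.drop(key):
        missing = df.loc[df[column].isna(), key]
        if not missing.empty:
            gaps[column] = missing.tolist()
    return gaps


# Hypothetical compiled table with runoff missing for February
demo = pd.DataFrame({
    "Month": pd.to_datetime(["2024-01-01", "2024-02-01"]),
    "Qin (AF)": [10.0, 12.0],
    "SR (AF)": [3.0, None],
})

gaps = missing_by_column(demo)
for column, months in gaps.items():
    print(column, "is missing for", [m.strftime("%Y-%m") for m in months])
```

Any arithmetic on a row with a NaN term also yields NaN, so a gap in a single input propagates to the residual for that month.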