# Groundwater Forcing Generator
**Author: Jun Sasaki | Created: 2025-09-21 Updated: 2025-09-21**

**Purpose:**
This notebook demonstrates how to generate groundwater forcing files for FVCOM using the `GroundwaterNetCDFGenerator`.

**Important Note on Node Indexing:**
- FVCOM uses **1-based node numbering** (nodes 1, 2, 3, ...)
- Python uses **0-based array indexing** (indices 0, 1, 2, ...)
- This notebook uses 1-based node IDs when referring to FVCOM nodes
- When accessing Python arrays, we convert: `array[node_id - 1]`
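
The conversion can be wrapped in a small helper that also rejects out-of-range IDs. This is a minimal sketch; `node_ids_to_indices` is a hypothetical name used for illustration and is not part of xfvcom.

```python
import numpy as np


def node_ids_to_indices(node_ids, n_nodes):
    """Convert 1-based FVCOM node IDs to 0-based NumPy indices.

    Hypothetical helper for illustration; not part of xfvcom.
    """
    ids = np.asarray(node_ids, dtype=int)
    bad = ids[(ids < 1) | (ids > n_nodes)]
    if bad.size:
        raise ValueError(f"Node IDs out of range 1..{n_nodes}: {bad.tolist()}")
    return ids - 1


indices = node_ids_to_indices([1, 2, 10], n_nodes=10)
print(indices)  # [0 1 9]
```

Raising on invalid IDs is a design choice: it surfaces typos in node lists early instead of silently dropping them.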

## Contents
1. Basic Setup and Imports
2. Constant and Uniform Values (Simplest)
3. Constant Uniform Selective Node Forcing (Specific Locations)
4. Constant Node-Varying Values (Spatial Variation)
5. Time-Varying Values (Temporal Variation)
6. Time-Varying Selective Nodes (Combined)
7. CSV Input - Wide Format
8. CSV Input - Long Format
9. Including Dye Concentration
10. Command Line Interface
11. Verification Examples
12. Summary

## 1. Basic Setup and Imports

In [None]:
import numpy as np
from pathlib import Path
from xfvcom.io import GroundwaterNetCDFGenerator
from xfvcom.grid import FvcomGrid
import xarray as xr
import pandas as pd

from IPython.core.magic import register_cell_magic
@register_cell_magic
def skip(line, cell):
    print("This cell is skipped.")

In [None]:
# Load FVCOM grid file (.dat format)
grid_file = Path("~/Github/TB-FVCOM/goto2023/input/TokyoBay18_grd.dat").expanduser()

# Output directories
output_nc = Path("NC")  # NetCDF output directory
output_csv = Path("CSV")  # CSV output directory

# Create output directories if they don't exist
output_nc.mkdir(exist_ok=True)
output_csv.mkdir(exist_ok=True)

# Get grid information
grid = FvcomGrid.from_dat(grid_file, utm_zone=54)
n_nodes = len(grid.x)
print(f"Grid has {n_nodes} nodes")
print(f"NetCDF output directory: {output_nc}")
print(f"CSV output directory: {output_csv}")

## 2. Constant and Uniform Values (Simplest)

The simplest case: apply the same groundwater properties to all nodes for all time steps.

In [None]:
# Create generator with constant values
gen = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,        # Despite the parameter name, it accepts .dat files
    start="2025-01-01T00:00:00Z",
    end="2025-01-07T00:00:00Z",
    dt_seconds=3600,          # hourly
    utm_zone=54,              # Required for .dat files: UTM zone
    flux=0.001,               # m³/s - constant for all nodes
    temperature=15.0,         # °C - constant for all nodes
    salinity=0.5,             # PSU - constant for all nodes
)

# Write to file in NetCDF output directory
output_file = output_nc / "groundwater_constant.nc"
gen.write(output_file)
print(f"Generated: {output_file}")

## 3. Constant Uniform Selective Node Forcing (Specific Locations)

Apply constant and uniform groundwater forcing only to specific nodes (e.g., known spring locations), with all other nodes set to zero.

In [None]:
# Define active groundwater nodes (1-based FVCOM node IDs)
active_gw_nodes = [1443, 1503, 1504, 1506, 1507, 1572, 1573, 1574, 1575, 1576]

# Define constant values for groundwater properties
gw_properties = {
    'flux': 1e-7,        # m/s
    'temperature': 14,  # °C
    'salinity': 30,      # PSU
    'dye': 300.0,        # Dye concentration for tracking (µmol/L)
}

print(f"Active nodes: {len(active_gw_nodes)} out of {n_nodes} total")

In [None]:
# Create arrays with values only at active nodes
flux_selective = np.zeros(n_nodes)
temp_selective = np.zeros(n_nodes)
salt_selective = np.zeros(n_nodes)
dye_selective = np.zeros(n_nodes)

for node_id in active_gw_nodes:
    idx = node_id - 1  # Convert 1-based to 0-based index
    if 0 <= idx < n_nodes:
        flux_selective[idx] = gw_properties['flux']
        temp_selective[idx] = gw_properties['temperature']
        salt_selective[idx] = gw_properties['salinity']
        dye_selective[idx] = gw_properties['dye']

# Generate NetCDF
year = 2021
gen_selective = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,
    start=f"{year}-01-01T00:00:00Z",
    end=f"{year+1}-01-01T00:00:00Z",
    dt_seconds=86400,
    utm_zone=54,
    flux=flux_selective,
    temperature=temp_selective,
    salinity=salt_selective,
    dye=dye_selective,
)

output_selective = output_nc / f"gwf_{year}.nc"
gen_selective.write(output_selective)
print(f"Generated: {output_selective}")
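
The per-node loop above can also be written with NumPy fancy indexing. The sketch below uses a hypothetical 8-node grid and made-up node IDs (`n_nodes_demo`, `active_ids`) so that it runs without a grid file.

```python
import numpy as np

n_nodes_demo = 8                      # hypothetical small grid
active_ids = np.array([2, 5, 6])      # 1-based FVCOM node IDs (made up)

flux_demo = np.zeros(n_nodes_demo)
flux_demo[active_ids - 1] = 1e-7      # fancy indexing with 0-based positions

# Recover the 1-based IDs of the nodes that carry a nonzero flux
print(np.flatnonzero(flux_demo) + 1)  # [2 5 6]
```

Unlike the loop, which silently skips invalid IDs, fancy indexing raises `IndexError` for an ID larger than the node count, and an ID of 0 becomes index -1, which wraps to the last node. Validate the ID list first if it comes from an external source.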

## 4. Constant Node-Varying Values (Spatial Variation)

Different values for each node, but constant in time.

In [None]:
# Create node-specific values
flux_by_node = np.random.uniform(0.0, 0.01, n_nodes)  # Random flux values
temp_by_node = np.random.uniform(10.0, 20.0, n_nodes) # Random temperatures
salt_by_node = np.random.uniform(0.0, 2.0, n_nodes)   # Random salinities

# Create generator
gen = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,
    start="2025-01-01T00:00:00Z",
    end="2025-01-02T00:00:00Z",
    dt_seconds=3600,
    utm_zone=54,
    flux=flux_by_node,
    temperature=temp_by_node,
    salinity=salt_by_node,
)

output_file = output_nc / "groundwater_node_varying.nc"
gen.write(output_file)
print(f"Generated: {output_file}")

## 5. Time-Varying Values (Temporal Variation)

Values that change over time (e.g., tidal influence on flux).

In [None]:
# Define time parameters
start = "2025-01-01T00:00:00Z"
end = "2025-01-03T00:00:00Z"
dt_hours = 6  # 6-hourly data

# Calculate number of time steps
times = pd.date_range(start, end, freq=f"{dt_hours}h", inclusive="both")
n_times = len(times)
print(f"Time steps: {n_times}")

# Create time-varying flux: sinusoid with a 4-step (24 h) period and a node-dependent phase
flux_data = np.zeros((n_nodes, n_times))
for i in range(n_nodes):
    phase = i * 2 * np.pi / n_nodes
    flux_data[i, :] = 0.005 * (1 + np.sin(2 * np.pi * np.arange(n_times) / 4 + phase))

# Create generator
gen = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,
    start=start,
    end=end,
    dt_seconds=dt_hours * 3600,
    utm_zone=54,
    flux=flux_data,
    temperature=15.0,  # Constant temperature
    salinity=0.0,      # Fresh water
)

output_file = output_nc / "groundwater_timevarying.nc"
gen.write(output_file)
print(f"Generated: {output_file}")
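
For large grids, the node loop that fills `flux_data` can be replaced by broadcasting. The sketch below uses hypothetical small sizes (`n_nodes_demo`, `n_times_demo`) and checks that the broadcast result matches the explicit loop.

```python
import numpy as np

n_nodes_demo, n_times_demo = 5, 9    # hypothetical small sizes

phase = 2 * np.pi * np.arange(n_nodes_demo) / n_nodes_demo   # shape (n_nodes,)
step = np.arange(n_times_demo)                               # shape (n_times,)

# phase[:, None] has shape (n_nodes, 1); broadcasting yields (n_nodes, n_times)
flux_vec = 0.005 * (1 + np.sin(2 * np.pi * step / 4 + phase[:, None]))

# Reference: the explicit per-node loop
flux_loop = np.zeros((n_nodes_demo, n_times_demo))
for i in range(n_nodes_demo):
    node_phase = i * 2 * np.pi / n_nodes_demo
    flux_loop[i, :] = 0.005 * (1 + np.sin(2 * np.pi * step / 4 + node_phase))

assert np.allclose(flux_vec, flux_loop)
print(flux_vec.shape)  # (5, 9)
```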

## 6. Time-Varying Selective Nodes (Combined)

Combine selective nodes with time-varying patterns.

In [None]:
# Time parameters
start_date = "2025-01-01T00:00:00Z"
end_date = "2025-01-07T00:00:00Z"
dt_seconds = 3600
times_utc = pd.date_range(start_date, end_date, freq=f"{dt_seconds}s", inclusive="both")
n_timesteps = len(times_utc)

# Initialize 2D arrays with zeros
flux_selective_2d = np.zeros((n_nodes, n_timesteps))
temp_selective_2d = np.zeros((n_nodes, n_timesteps))

# Create time-varying patterns for active nodes only
for node_id in active_gw_nodes:
    idx = node_id - 1
    if 0 <= idx < n_nodes:
        t_hours = np.arange(n_timesteps)
        
        # Tidal variation for flux
        tidal_period = 12.42  # hours (M2 tidal period)
        phase = (node_id % 10) * np.pi / 5
        flux_selective_2d[idx, :] = 0.001 * (1 + 0.3 * np.sin(2 * np.pi * t_hours / tidal_period + phase))
        
        # Diurnal variation for temperature
        temp_selective_2d[idx, :] = 12.5 + 0.5 * np.sin(2 * np.pi * t_hours / 24)

# Generate NetCDF
gen = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,
    start=start_date,
    end=end_date,
    dt_seconds=dt_seconds,
    utm_zone=54,
    flux=flux_selective_2d,
    temperature=temp_selective_2d,
    salinity=0.5,
)

output_file = output_nc / "groundwater_selective_timevarying.nc"
gen.write(output_file)
print(f"Generated: {output_file}")
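
The 2D arrays in this notebook are sized from `pd.date_range(..., inclusive="both")`, so both end points count as time steps. A quick way to compute the expected length is sketched below; `expected_n_times` is a hypothetical helper, and whether the generator uses exactly this convention internally is an assumption based on how the arrays are built here.

```python
import pandas as pd


def expected_n_times(start, end, dt_seconds):
    """Number of time steps from start to end inclusive, at a fixed interval.

    Hypothetical helper for illustration; not part of xfvcom.
    """
    span = (pd.Timestamp(end) - pd.Timestamp(start)).total_seconds()
    return int(span // dt_seconds) + 1


n_expected = expected_n_times("2025-01-01T00:00:00Z", "2025-01-07T00:00:00Z", 3600)
print(n_expected)  # 145
```

Note that 1 January to 7 January at hourly resolution spans six days, giving 6 × 24 + 1 = 145 steps rather than 168.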

## 7. CSV Input - Wide Format

Read time-varying data from CSV files where each column represents a node.

In [None]:
# Select nodes for CSV example (1-based FVCOM node IDs)
selected_nodes = [101, 201, 301, 401, 501]

# Create time series
times_local = pd.date_range("2025-01-01", "2025-01-07", freq="6h", tz="Asia/Tokyo")
n_times = len(times_local)

# Create flux CSV
flux_df = pd.DataFrame({'datetime': times_local})
for node_id in selected_nodes:
    base_flux = 0.001 + 0.002 * np.random.rand()
    hours = np.array([t.hour for t in times_local])
    daily_var = 0.3 * np.sin(2 * np.pi * hours / 24 - np.pi/2)
    flux_df[f'node_{node_id}'] = base_flux * (1 + daily_var)

flux_csv = output_csv / "groundwater_flux_timeseries.csv"
flux_df.to_csv(flux_csv, index=False)
print(f"Created: {flux_csv}")
print(flux_df.head())

In [None]:
def read_spatiotemporal_csv(csv_file, selected_nodes, total_nodes):
    """Read CSV and convert to 2D array for FVCOM."""
    df = pd.read_csv(csv_file)
    df['datetime'] = pd.to_datetime(df['datetime'])
    
    times = pd.DatetimeIndex(df['datetime'])
    n_times = len(times)
    
    # Initialize with zeros
    data_array = np.zeros((total_nodes, n_times))
    
    # Fill data for selected nodes
    for node_id in selected_nodes:  # 1-based
        col_name = f'node_{node_id}'
        if col_name in df.columns:
            data_array[node_id - 1, :] = df[col_name].values  # Convert to 0-based
    
    return times, data_array

# Read CSV
flux_times, flux_array = read_spatiotemporal_csv(flux_csv, selected_nodes, n_nodes)
print(f"Flux array shape: {flux_array.shape}")
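
The CSV above stores timestamps in `Asia/Tokyo` local time, while the generator examples use UTC `start` and `end` strings. If the array is later aligned with a UTC time axis, the timestamps need converting first. A minimal sketch using `pd.to_datetime(..., utc=True)` on two made-up sample strings:

```python
import pandas as pd

# Timestamps as they would appear in a CSV written from a tz-aware Asia/Tokyo index
raw = pd.Series(["2025-01-01 00:00:00+09:00", "2025-01-01 06:00:00+09:00"])

# utc=True parses the offset and converts every value to UTC
times_utc = pd.DatetimeIndex(pd.to_datetime(raw, utc=True))
print(times_utc[0])  # 2024-12-31 15:00:00+00:00
```

Midnight in Tokyo is 15:00 UTC on the previous day, so a local-time series starting on 1 January does not start at the same instant as `start="2025-01-01T00:00:00Z"`.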

## 8. CSV Input - Long Format

An alternative CSV layout with one row per (time, node) pair and a `node_id` column. It is more compact than the wide format when only a few nodes carry data.

In [None]:
# Create long format CSV
long_data = []
for t_idx, time in enumerate(flux_times):
    for node_id in selected_nodes:  # 1-based
        long_data.append({
            'datetime': time,
            'node_id': node_id,  # Store 1-based
            'flux': flux_array[node_id - 1, t_idx],
            'temperature': 15.0,
        })

long_df = pd.DataFrame(long_data)
long_csv = output_csv / "groundwater_long_format.csv"
long_df.to_csv(long_csv, index=False)
print(f"Created: {long_csv}")
print(long_df.head())
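
Reading a long-format table back into a `(nodes × time)` array is a pivot followed by a 1-based to 0-based index shift. The sketch below uses a small made-up table; `long_to_array` is a hypothetical helper, not an xfvcom function.

```python
import numpy as np
import pandas as pd


def long_to_array(df, value_col, total_nodes):
    """Pivot a long-format table into a (total_nodes, n_times) array.

    Hypothetical helper; nodes absent from the table stay at zero.
    """
    wide = df.pivot(index="node_id", columns="datetime", values=value_col)
    data = np.zeros((total_nodes, wide.shape[1]))
    data[wide.index.to_numpy() - 1, :] = wide.to_numpy()  # 1-based -> 0-based
    return pd.DatetimeIndex(wide.columns), data


demo = pd.DataFrame({
    "datetime": pd.to_datetime(["2025-01-01", "2025-01-01", "2025-01-02", "2025-01-02"]),
    "node_id": [2, 4, 2, 4],          # 1-based FVCOM node IDs (made up)
    "flux": [0.1, 0.2, 0.3, 0.4],
})
times, arr = long_to_array(demo, "flux", total_nodes=5)
print(arr.shape)  # (5, 2)
```

If a (time, node) pair is missing from the table, `pivot` leaves `NaN` in that cell, so incomplete data should be filled (for example with `wide.fillna(0.0)`) before use.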

## 9. Including Dye Concentration

Add dye tracer for tracking groundwater influence.

In [None]:
# Constant dye for all nodes
gen = GroundwaterNetCDFGenerator(
    grid_nc=grid_file,
    start="2025-01-01T00:00:00Z",
    end="2025-01-07T00:00:00Z",
    dt_seconds=3600,
    utm_zone=54,
    flux=0.001,
    temperature=15.0,
    salinity=0.5,
    dye=100.0,  # Add dye concentration
)

output_file = output_nc / "groundwater_with_dye.nc"
gen.write(output_file)
print(f"Generated: {output_file} with dye concentration")

In [None]:
# Time-varying dye with pulse releases
n_timesteps = 24 * 7  # One week of hourly steps
dye_selective = np.zeros((n_nodes, n_timesteps))

for node_id in active_gw_nodes:
    idx = node_id - 1
    if 0 <= idx < n_nodes:
        # One pulse per week; a single week yields one pulse starting at t = 0
        for pulse_start in range(0, n_timesteps, 24 * 7):
            # Each pulse lasts 24 h and decays with a 12 h e-folding time
            for t in range(min(24, n_timesteps - pulse_start)):
                dye_selective[idx, pulse_start + t] = 100.0 * np.exp(-t / 12)

print(f"Created dye pulse pattern for {len(active_gw_nodes)} nodes")

## 10. Command Line Interface

You can also use the CLI tool for simple cases:

In [None]:
# Display CLI help
!xfvcom-make-groundwater-nc --help

### CLI Examples:

```bash
# Constant values
xfvcom-make-groundwater-nc grid.nc \
    --start 2025-01-01T00:00Z --end 2025-01-07T00:00Z \
    --flux 0.001 --temperature 15.0 --salinity 0.0 --dye 100.0

# For .dat grid files
xfvcom-make-groundwater-nc grid.dat --utm-zone 54 \
    --start 2025-01-01T00:00Z --end 2025-01-02T00:00Z \
    --flux 0.005 --temperature 18.0 --salinity 32.0
```

## 11. Verification Examples

Check the generated NetCDF files.

In [None]:
# Verify constant value output
nc_file = output_nc / "groundwater_constant.nc"
with xr.open_dataset(nc_file, decode_times=False) as ds:
    print(f"Checking {nc_file}")
    print("Dataset dimensions:")
    print(f"  Time steps: {ds.sizes['time']}")
    print(f"  Nodes: {ds.sizes['node']}")
    print("\nVariables:")
    for var in ds.data_vars:
        print(f"  {var}: {ds[var].dims}")

In [None]:
# Verify selective nodes (file written in Section 3)
nc_file = output_selective
with xr.open_dataset(nc_file, decode_times=False) as ds:
    print(f"Checking {nc_file}")
    flux = ds["groundwater_flux"]
    flux_sum = flux.sum(dim="time").values  # Sum over time, one value per node
    active_indices = np.where(flux_sum > 0)[0]
    active_node_ids = active_indices + 1  # Convert to 1-based

    print(f"Active nodes (1-based): {list(active_node_ids)}")
    print(f"Number of active nodes: {len(active_node_ids)}")
    # Flux is constant in time in this file, so the first time step is representative
    total_per_step = float(flux.isel(time=0).sum())
    print(f"Total flux per timestep: {total_per_step:.3e} m/s")

## 12. Summary

This notebook demonstrated groundwater forcing generation for FVCOM, progressing from simple to complex:

### Complexity Progression:
1. **Constant values** - Simplest case, same values everywhere
2. **Selective nodes** - Apply forcing only to specific locations
3. **Spatial variation** - Different values per node
4. **Temporal variation** - Time-varying patterns
5. **Combined variations** - Both spatial and temporal
6. **CSV input** - Read from external data files
7. **Dye tracers** - Add passive tracer concentration

### Key Points for Node Indexing:
- **FVCOM uses 1-based node numbering** (Fortran convention)
- **CSV files use 1-based node IDs** (e.g., `node_101`, `node_201`)
- **Python arrays use 0-based indexing**: convert with `array[node_id - 1]`
- NetCDF output maintains proper alignment: array position 0 → FVCOM node 1

### Input Options:
The `GroundwaterNetCDFGenerator` accepts:
- **Scalars**: Single value for all nodes and times
- **1D arrays**: Different values per node (constant in time)
- **2D arrays**: Full spatio-temporal variation (nodes × time)
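
The three input shapes can be understood as successive broadcasts to a common `(nodes × time)` layout. The sketch below illustrates that idea only; `to_node_time` is a hypothetical function and does not reproduce how `GroundwaterNetCDFGenerator` handles its inputs internally.

```python
import numpy as np


def to_node_time(value, n_nodes, n_times):
    """Expand a scalar, 1D (n_nodes,), or 2D (n_nodes, n_times) input to 2D.

    Illustrative only; this is not the xfvcom implementation.
    """
    arr = np.asarray(value, dtype=float)
    if arr.ndim == 0:
        # Scalar: same value at every node and time step
        return np.full((n_nodes, n_times), float(arr))
    if arr.ndim == 1 and arr.shape == (n_nodes,):
        # Per-node values: repeat along the time axis
        return np.repeat(arr[:, None], n_times, axis=1)
    if arr.shape == (n_nodes, n_times):
        # Already full spatio-temporal data
        return arr
    raise ValueError(f"Unsupported shape {arr.shape}")


print(to_node_time(15.0, 3, 4).shape)              # (3, 4)
print(to_node_time([1.0, 2.0, 3.0], 3, 4)[:, 0])   # [1. 2. 3.]
```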

### Practical Applications:
1. **Submarine groundwater discharge** - Coastal freshwater inputs
2. **Contaminant tracking** - Using dye tracers
3. **Spring locations** - Point sources of groundwater
4. **Tidal pumping** - Time-varying flux patterns
5. **Temperature impacts** - Cool groundwater affecting stratification

### Next Steps:
- Integrate with field measurements
- Couple with salinity observations
- Use dye tracers for model validation
- Apply realistic temporal patterns from monitoring data