# Transport Case Study: Group 0 (Demo)

**Scenario**: Industrial Trichloroethylene (TCE) Spill

**Group Members**: 
- TODO: Add your names

**Date**: TODO

---

## 1. Overview and Learning Objectives

### Problem Statement
A 30-day trichloroethylene (TCE) spill from an industrial facility has contaminated the groundwater in the Limmat Valley. Your task is to:
1. Model the transport of TCE through the aquifer over a 2-year period
2. Analyze how the existing well field (pumping and injection wells) affects plume migration
3. Assess whether contamination reaches the Limmat River or compliance monitoring points
4. **(Optional for bonus credit)** Verify your numerical model against analytical solutions

### Learning Objectives
By completing this case study, you will:
- Set up a coupled MODFLOW-MT3DMS transport model
- Apply the telescope approach to refine resolution around source and wells
- Define contaminant source terms using the SSM package
- Analyze well-contaminant interactions (capture zones, spreading)
- **(Optional for bonus credit)** Verify numerical results with analytical solutions
- Communicate findings in a professional modeling report

### Deliverables
1. This completed notebook with all code executed and results displayed
2. Completed `case_config_transport.yaml` with justified parameter choices
3. Professional report (3-4 pages PDF) summarizing methods, results, and conclusions
4. **(Optional for bonus: +5-10%)** Analytical comparison section with plots and discussion

### Key Questions to Answer
- Will the pumping wells capture the TCE plume before it reaches the Limmat River?
- What is the maximum extent of the contamination (area where C > 5 mg/L)?
- When will contamination reach monitoring locations?
- How do the injection wells or the infiltration gallery (Sickergalerie) affect plume spreading?
- **(Optional)** How well does a 1D analytical solution predict plume behavior compared to the full 2D/3D numerical model?

---
## 2. Workflow Summary

### Transport Case Study Workflow

This case study follows a simpler workflow than the flow case study (no scenario variations):

```
1. Load fresh base parent model (independent from your flow case results)
   ↓
2. Load well locations from flow case study (case_config.yaml)
   ↓
3. Define transport submodel domain (around wells and source area)
   ↓
4. Create refined grid for submodel (5m cells for better resolution)
   ↓
5. Extract boundary conditions from parent model
   ↓
6. Set up MODFLOW submodel with wells (steady-state flow)
   ↓
7. Run flow model and verify convergence
   ↓
8. Set up MT3DMS transport model (transient concentrations)
   ↓
9. Define contaminant source term (SSM package)
   ↓
10. Run 2-year transport simulation
    ↓
11. Post-process: concentration maps, breakthrough curves, mass balance
    ↓
12. Analyze well-contaminant interactions
    ↓
13. OPTIONAL (bonus): Analytical comparison (Ogata-Banks 1D)
    ↓
14. Interpret results and write professional report
```

### Key Concept: Steady Flow + Transient Transport

We assume **steady-state flow** (heads don't change with time) but **transient transport** (concentrations evolve over time). This is the standard approach for long-term contamination problems because:
- Groundwater flow reaches equilibrium quickly (days to weeks)
- Contaminant transport is much slower (months to years)
- A fixed flow field lets us focus on transport processes without rerunning flow at each time step

### Differences from Flow Case Study

| Aspect | Flow Case Study | Transport Case Study |
|--------|----------------|---------------------|
| Starting model | Base parent model | Same fresh base parent model |
| Wells | Student implements from concession | **Reuse from case_config.yaml** |
| Scenarios | 3 stages + parameter variations | **Single run: wells + transport** |
| Complexity | 3-stage workflow | **Simpler: 1-stage** |
| Focus | Flow system response | **Contaminant fate and transport** |
| Analysis | Drawdown, river leakage | **Plume migration, breakthrough** |
| Time dimension | Steady-state | **Steady flow + transient transport** |
| Analytical check | Not applicable | **Optional (bonus credit)** |

### Time Estimate
- Setup and configuration: 2-3 hours
- Model execution and debugging: 2-3 hours
- Analysis and visualization: 2-3 hours
- **(Optional) Analytical comparison: 0.5-1 hour** → **+5-10% bonus**
- Report writing: 2-3 hours
- **Total: ~8-12 hours** (or ~9-13 hours with bonus section)

---
## 3. Configuration and Setup

### Import Libraries

Import all necessary Python libraries for modeling, analysis, and visualization.

In [None]:
# Import required libraries
import sys
import os
import numpy as np
import pickle
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.lines as mlines
import matplotlib.patches as mpatches
from shapely.geometry import Point, Polygon
from shapely.affinity import rotate
import flopy
from flopy.discretization import StructuredGrid

# print current working directory
print("Current working directory: ", os.getcwd())

# Add the support repo to the path
sys.path.append(os.path.abspath('../../../SUPPORT_REPO/src'))
sys.path.append(os.path.abspath('../../../SUPPORT_REPO/src/scripts/scripts_exercises'))

# Import local modules
import case_utils 
from data_utils import download_named_file, get_default_data_folder
import grid_utils
from print_images import display_image
import plot_utils

### Load Configuration

Load the flow case study configuration from `case_config.yaml`, then merge the transport scenario configuration from `case_config_transport.yaml` into it.
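
The cell below merges the two configurations with `dict.update`, which is a shallow merge: a top-level key present in both files (for example `model`) is replaced as a whole rather than combined key by key. A minimal sketch of that behavior, using made-up keys and values rather than the actual contents of either YAML file:

```python
# Hypothetical stand-ins for the two loaded YAML files (illustrative values only)
cfg_flow = {
    "group": {"number": 0},
    "model": {"workspace": "~/flow_ws", "namefile": "flow.nam"},
}
cfg_transport_demo = {
    "model": {"workspace": "~/transport_ws"},
    "transport_scenarios": {"options": []},
}

merged = dict(cfg_flow)             # copy so the original dict is left untouched
merged.update(cfg_transport_demo)   # shallow merge: top-level keys are overwritten

print(sorted(merged.keys()))
print(merged["model"])              # the flow 'namefile' entry is no longer present
```

If a value from the flow configuration is still needed after the merge, it is safest to read it before calling `update` or to keep the two dictionaries separate.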

In [None]:
# Load flow case study configuration
CASE_YAML = 'case_config.yaml'
cfg = case_utils.load_yaml(CASE_YAML)

# Get group configuration
group_number = cfg['group'].get('number', 0)
if not isinstance(group_number, int) or group_number < 0 or group_number > 8:
    raise ValueError("Group number must be an integer between 0 and 8.")

print(f"Group number: {group_number}")

# Load and merge the transport configuration
TRANSPORT_YAML = 'case_config_transport.yaml'
cfg_transport = case_utils.load_yaml(TRANSPORT_YAML)

# Merge transport config into cfg (transport-specific keys added to main config)
# This keeps both configs accessible from a single object
cfg.update(cfg_transport)

print(f"Configuration loaded successfully. Available sections: {list(cfg.keys())}")

---
## 4. Load Parent Flow Model

### Download and Load Base Model

Load the fresh base parent model (independent from your flow case study results). This ensures a known-good starting point for transport modeling.

**Important**: We use the baseline model, not your modified flow case study model, to:
- Avoid error propagation from flow modeling
- Start from a verified, converged flow field
- Simplify the workflow

In [None]:
# Download parent base model and save it to the transport workspace
# To make sure not to carry over any unintended changes from the flow case study, 
# we download a fresh copy of the base model specified in the transport configuration.

# After cfg.update(cfg_transport), cfg['model'] now contains the transport model config
# with workspace pointing to the transport subdirectory
parent_base_model_name = cfg['model']['data_name']
parent_workspace = os.path.expanduser(cfg['model']['workspace'])

# Download to the transport-specific directory
parent_base_model_path = download_named_file(
    parent_base_model_name, 
    dest_folder=parent_workspace,
    data_type=None,  # Don't append additional subdirectory
)

# Handle zip file extraction if needed
if parent_base_model_path.endswith('.zip'):
    import zipfile
    extract_path = os.path.dirname(parent_base_model_path)
    with zipfile.ZipFile(parent_base_model_path, 'r') as zip_ref:
        zip_ref.extractall(extract_path)
    parent_base_model_path = os.path.join(extract_path, cfg['model']['namefile'])

print(f'Downloaded the parent base model to: {parent_base_model_path}')
print(f'Model workspace: {parent_workspace}')

### Verify Parent Model

Load the parent model and run it if no heads file exists yet, then plot the heads to check that the flow field looks reasonable.

In [None]:
# ----- Load model results ----- #
parent_base_namefile = os.path.basename(parent_base_model_path)
m_parent_base = flopy.modflow.Modflow.load(
    parent_base_namefile, 
    model_ws=parent_workspace, 
    check=False, 
    forgive=False, 
    exe_name='mfnwt'
)
# Check if heads file exists, if not run the model
parent_hds_path = os.path.join(parent_workspace, f"{m_parent_base.name}.hds")
if not os.path.exists(parent_hds_path):
    print("Parent model heads file not found. Running parent model...")
    success, buff = m_parent_base.run_model(silent=True, report=True)
    if not success:
        raise RuntimeError("Parent model failed to run")
    print("✓ Parent model run completed")

# Load and visualize groundwater heads
headobj = flopy.utils.HeadFile(parent_hds_path)
print(f'Heads loaded from {parent_hds_path}')
heads = headobj.get_data()[0]  # Layer 0 of the last saved time step


# ----- Create visualization ----- #
fig, ax = plt.subplots(figsize=(16, 12))

# Plot model with heads
pmv = flopy.plot.PlotMapView(model=m_parent_base, ax=ax)

# Plot head distribution as colored background
heads_masked = np.ma.masked_where(m_parent_base.bas6.ibound.array[0] <= 0, heads)
im = pmv.plot_array(heads_masked, alpha=0.6, cmap='Blues')

# Add head contours
contour_levels = np.linspace(np.nanmin(heads_masked), np.nanmax(heads_masked), 15)
cont = pmv.contour_array(heads_masked, levels=contour_levels, colors='black', 
                        linewidths=1.5, linestyles='-')
ax.clabel(cont, inline=True, fontsize=9, fmt='%.1f m')

# Plot model grid (light)
pmv.plot_grid(color='gray', alpha=0.3, linewidth=0.5)

# Add colorbar
cbar = plt.colorbar(im, ax=ax, shrink=0.3, pad=0.02)
cbar.set_label('Hydraulic Head (m a.s.l.)', fontsize=12)

# Formatting
ax.set_title(f'Parent Base Model - Groundwater Flow Field\nGroup {group_number} - Hydraulic Heads', 
             fontsize=14, fontweight='bold')
ax.set_xlabel('X Coordinate (m)', fontsize=12)
ax.set_ylabel('Y Coordinate (m)', fontsize=12)
ax.set_aspect('equal')

# Add text box with model info
info_text = f'Model: {m_parent_base.name}\nGrid: {m_parent_base.nrow}×{m_parent_base.ncol}\nCell size: {m_parent_base.dis.delr[0]:.0f}m'
ax.text(0.02, 0.98, info_text, transform=ax.transAxes, fontsize=10,
        verticalalignment='top', bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

plt.tight_layout()
plt.show()

---
## 5. Load Well Data from Flow Case Study

### Load Wells from case_config.yaml

Load the well locations and pumping/injection rates from your flow case study. Wells are **reused**, not reimplemented.

**Key Tasks:**
1. Read well data from flow case `case_config.yaml` or concession CSV
2. Identify which wells are **pumping** (negative Q) vs **injection** (positive Q)
3. Map well locations to parent model grid
4. Visualize well locations relative to planned source location
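
The sign convention in task 2 follows the MODFLOW WEL package, where negative rates withdraw water and positive rates inject it. A minimal sketch of the split, using hypothetical well names and rates (`q_m3_per_day` is an assumed field name, not one taken from the concession data):

```python
# Hypothetical wells; names and rates are illustrative only
wells = [
    {"name": "W1", "q_m3_per_day": -500.0},
    {"name": "W2", "q_m3_per_day": 300.0},
    {"name": "W3", "q_m3_per_day": -120.0},
]


def classify_wells(wells):
    """Split wells into pumping (Q < 0) and injection (Q > 0) lists."""
    pumping = [w for w in wells if w["q_m3_per_day"] < 0]
    injection = [w for w in wells if w["q_m3_per_day"] > 0]
    return pumping, injection


pumping, injection = classify_wells(wells)
net_q = sum(w["q_m3_per_day"] for w in wells)  # negative means net abstraction

print("Pumping:  ", [w["name"] for w in pumping])
print("Injection:", [w["name"] for w in injection])
print("Net rate: ", net_q, "m3/day")
```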

In [None]:
# TODO: Load well data from case_config.yaml
# Parse well information
# Separate pumping vs injection wells
# Map to grid coordinates

# Load the rotated model grid for visualization
modelgrid_path = os.path.join(
    os.path.dirname(parent_base_model_path),
    f"{os.path.splitext(parent_base_namefile)[0]}_modelgrid.pkl"
)
print(f"Loading model grid from: {modelgrid_path}")
with open(modelgrid_path, 'rb') as f:
    parent_modelgrid = pickle.load(f)

print(f"Parent model grid loaded")
print(f"Grid rotation: {parent_modelgrid.angrot} degrees")
print(f"Grid extent: X [{parent_modelgrid.extent[0]:.1f}, {parent_modelgrid.extent[1]:.1f}]")
print(f"             Y [{parent_modelgrid.extent[2]:.1f}, {parent_modelgrid.extent[3]:.1f}]")

# Get well locations for the specified group
scenario = case_utils.get_scenario_for_group(CASE_YAML, group_number)
concession_id = scenario.get('concession', None)
if concession_id is None:
    raise ValueError(f"Concession ID not defined for group {group_number}")

# Load and filter wells by concession
well_data_path = download_named_file(name='wells', data_type='gis')
wells_gdf = gpd.read_file(well_data_path, layer='GS_GRUNDWASSERFASSUNGEN_OGD_P')
wells_gdf = case_utils.filter_wells_by_concession(wells_gdf, concession_id)

print(f"\nWells for concession {concession_id}:")
print(wells_gdf[['GWR_ID', 'GWR_PREFIX', 'FASSART']])

# Identify source location from transport scenario
# Navigate through transport_scenarios -> options -> find id matching group_number
transport_scenarios = cfg.get('transport_scenarios', {})
scenario_options = transport_scenarios.get('options', [])

# Find the scenario matching the group number
scenario_config = None
for option in scenario_options:
    if option.get('id') == group_number:
        scenario_config = option
        break

if scenario_config is None:
    raise ValueError(f"No transport scenario found for group {group_number}")

print(f"\nTransport scenario: {scenario_config.get('title', 'Unknown')}")
print(f"Contaminant: {scenario_config.get('contaminant', 'Unknown')}")

# Extract source location (relative coordinates from config)
source_config = scenario_config.get('source', {})
source_location = source_config.get('location', {})
source_easting_relative = source_location.get('easting', None)
source_northing_relative = source_location.get('northing', None)

if source_easting_relative is None or source_northing_relative is None:
    raise ValueError(f"Source location not defined for group {group_number} in case_config_transport.yaml")

# Convert relative coordinates to absolute Swiss coordinates
# Use the first well (or centroid of wells) as reference point
reference_well = wells_gdf.iloc[0]
reference_easting = reference_well.geometry.x
reference_northing = reference_well.geometry.y

source_easting = reference_easting + source_easting_relative
source_northing = reference_northing + source_northing_relative

print(f"\nSource location:")
print(f"  Reference well: {reference_well['GWR_PREFIX']} at ({reference_easting:.1f}, {reference_northing:.1f})")
print(f"  Relative offset: ({source_easting_relative:+.1f}, {source_northing_relative:+.1f}) m")
print(f"  Absolute coordinates (Swiss LV03/95): ({source_easting:.1f}, {source_northing:.1f}) m")

# Calculate distance and direction from source to each well
wells_gdf['distance_to_source'] = np.sqrt(
    (wells_gdf.geometry.x - source_easting)**2 + 
    (wells_gdf.geometry.y - source_northing)**2
)

# Calculate bearing from source to well (degrees from North)
wells_gdf['bearing_from_source'] = np.degrees(
    np.arctan2(wells_gdf.geometry.x - source_easting, 
               wells_gdf.geometry.y - source_northing)
)

print(f"\nWell positions relative to source:")
for idx, row in wells_gdf.iterrows():
    print(f"  {row['GWR_PREFIX']} ({row['FASSART']}): "
          f"{row['distance_to_source']:.1f} m at {row['bearing_from_source']:.1f}° from N")

# Identify closest well to source
closest_well_idx = wells_gdf['distance_to_source'].idxmin()
closest_well = wells_gdf.loc[closest_well_idx]
print(f"\nClosest well to source: {closest_well['GWR_PREFIX']} "
      f"({closest_well['distance_to_source']:.1f} m away)")

# Create source point geometry for plotting
source_point = gpd.GeoDataFrame(
    {'id': ['source'], 'type': ['contamination_source']},
    geometry=[Point(source_easting, source_northing)],
    crs=wells_gdf.crs
)

### Visualize Well Field

Plot well locations on model grid to understand spatial arrangement.

In [None]:
# Visualize wells and source on parent model
case_utils.plot_wells_on_model(m_parent_base, modelgrid=parent_modelgrid, wells_gdf=wells_gdf, 
                               concession_id=concession_id, source_point=source_point)

---
## 6. Define Transport Submodel Domain

### Estimate Plume Travel Distance

Before defining the submodel domain, estimate how far the plume might travel in 2 years:

**Travel distance** = velocity × time = (K·i/n) × t

Where:
- K = hydraulic conductivity (m/day) - from parent model
- i = hydraulic gradient (m/m) - from parent model heads
- n = effective porosity - from config (typically 0.25)
- t = simulation time (2 years = 730 days)

**Rule of thumb**: Buffer should be at least 1.5× to 2× estimated travel distance to ensure plume stays within domain.
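
As a rough worked example of the estimate above, the sketch below applies the formula to assumed input values (K = 50 m/day and i = 0.002 are illustrative, not read from the parent model). The estimate is purely advective: it ignores dispersion and sorption, so it is a first guess for sizing the domain rather than a prediction of plume extent.

```python
def travel_distance_m(k_m_per_day, gradient, porosity, time_days):
    """Advective travel distance: seepage velocity (K*i/n) times elapsed time."""
    seepage_velocity = k_m_per_day * gradient / porosity  # m/day
    return seepage_velocity * time_days


# Illustrative inputs only (assumed, not taken from the parent model)
distance = travel_distance_m(k_m_per_day=50.0, gradient=0.002, porosity=0.25, time_days=730)

# Rule-of-thumb buffer range: 1.5x to 2x the travel distance
buffer_min = 1.5 * distance
buffer_max = 2.0 * distance

print(f"Travel distance ≈ {distance:.0f} m")
print(f"Buffer range: {buffer_min:.0f} to {buffer_max:.0f} m")
```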

In [None]:
# Extract hydraulic conductivity from parent model
lpf = m_parent_base.get_package('LPF')
if lpf is None:
    upw = m_parent_base.get_package('UPW')
    if upw is None:
        raise ValueError("No LPF or UPW package found in parent model")
    hk = upw.hk.array  # Hydraulic conductivity (m/day)
else:
    hk = lpf.hk.array  # Hydraulic conductivity (m/day)

# Get average K in layer 0 near the source location
# Use the source location to find nearby cells
layer = 0
source_row, source_col = parent_modelgrid.intersect(source_easting, source_northing)[:2]

# Sample K in a 3x3 neighborhood around source
row_start = max(0, source_row - 1)
row_end = min(hk.shape[1], source_row + 2)
col_start = max(0, source_col - 1)
col_end = min(hk.shape[2], source_col + 2)

k_local = hk[layer, row_start:row_end, col_start:col_end]
k_avg = np.mean(k_local[k_local > 0])  # Average non-zero K values

print(f"\nHydraulic Conductivity near source:")
print(f"  Layer {layer}, Row {source_row}, Col {source_col}")
print(f"  Local K range: {np.min(k_local[k_local > 0]):.2f} to {np.max(k_local):.2f} m/day")
print(f"  Average K: {k_avg:.2f} m/day")

# Calculate hydraulic gradient from head distribution
# Note: 'heads' is already 2D (layer 0 extracted in previous cell)
# Sample gradient along flow direction near source
dx = parent_modelgrid.delr[source_col]  # Cell size in x-direction
dy = parent_modelgrid.delc[source_row]  # Cell size in y-direction

# Calculate gradient using central differences
if source_col > 0 and source_col < heads.shape[1] - 1:
    dh_dx = (heads[source_row, source_col + 1] - 
             heads[source_row, source_col - 1]) / (2 * dx)
else:
    dh_dx = 0.0

if source_row > 0 and source_row < heads.shape[0] - 1:
    dh_dy = (heads[source_row + 1, source_col] - 
             heads[source_row - 1, source_col]) / (2 * dy)
else:
    dh_dy = 0.0

# Hydraulic gradient magnitude
gradient = np.sqrt(dh_dx**2 + dh_dy**2)

print(f"\nHydraulic Gradient near source:")
print(f"  dh/dx: {dh_dx:.6f}")
print(f"  dh/dy: {dh_dy:.6f}")
print(f"  Gradient magnitude: {gradient:.6f}")

# Get porosity from transport configuration
porosity = scenario_config['transport']['porosity']
print(f"\nEffective porosity: {porosity}")

# Calculate seepage velocity (Darcy velocity / porosity)
darcy_velocity = k_avg * gradient  # m/day
seepage_velocity = darcy_velocity / porosity  # m/day

print(f"\nVelocity calculations:")
print(f"  Darcy velocity (q): {darcy_velocity:.4f} m/day")
print(f"  Seepage velocity (v): {seepage_velocity:.4f} m/day ({seepage_velocity * 365:.1f} m/year)")

# Estimate 2-year travel distance
simulation_time_days = 730  # 2 years
travel_distance = seepage_velocity * simulation_time_days  # meters

print(f"\n2-year plume travel distance estimate:")
print(f"  Distance = velocity × time")
print(f"  Distance = {seepage_velocity:.4f} m/day × {simulation_time_days} days")
print(f"  Distance ≈ {travel_distance:.0f} m")

# Suggest buffer distances for submodel
# Rule of thumb: 1.5× to 2× travel distance
buffer_min = 1.5 * travel_distance
buffer_max = 2.0 * travel_distance

print(f"\nRecommended submodel buffer:")
print(f"  Minimum: {buffer_min:.0f} m (1.5× travel distance)")
print(f"  Maximum: {buffer_max:.0f} m (2.0× travel distance)")
print(f"  Suggested: {int(np.ceil(buffer_min / 100) * 100)} m (rounded up to nearest 100 m)")

# Check against configured buffer
config_buffer = scenario_config['submodel']['buffer_north_m']
print(f"\nConfigured buffer in case_config_transport.yaml: {config_buffer} m")
if config_buffer < buffer_min:
    print(f"  ⚠️  WARNING: Configured buffer may be insufficient!")
    print(f"  Consider increasing to at least {int(np.ceil(buffer_min / 100) * 100)} m")
elif config_buffer > buffer_max:
    print(f"  ✓ Buffer is conservative (larger than needed)")
else:
    print(f"  ✓ Buffer is adequate for 2-year simulation")

### Define Submodel Extent

Define the submodel domain boundaries. The domain should:
1. Include the contaminant source location
2. Include all wells from the well field
3. Have sufficient buffer for 2-year plume migration
4. Avoid placing boundaries where steep gradients are expected
5. Align with parent grid cells for easier boundary extraction

**Typical domain**: 600×600 m around source and wells (manageable grid: ~120×120 cells with 5m spacing)
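
The cell count quoted above follows directly from the extent and the cell size. A small sketch of that arithmetic (the helper `grid_dimensions` is hypothetical and not part of the support repo):

```python
import math


def grid_dimensions(width_m, height_m, cell_size_m):
    """Number of rows and columns needed to cover a rectangular extent."""
    ncol = math.ceil(width_m / cell_size_m)
    nrow = math.ceil(height_m / cell_size_m)
    return nrow, ncol


nrow, ncol = grid_dimensions(width_m=600.0, height_m=600.0, cell_size_m=5.0)
print(f"{nrow} rows x {ncol} cols = {nrow * ncol:,} cells per layer")
```

Halving the cell size quadruples the cell count per layer, so it is worth checking this number before committing to a finer grid.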

In [None]:
# Define submodel parameters
sub_cell_size = scenario_config['submodel']['cell_size_m']  # from config
parent_cell_size = parent_modelgrid.delr[0]  # parent model cell size

print(f"Parent model cell size: {parent_cell_size} m")
print(f"Submodel cell size: {sub_cell_size} m") 
print(f"Refinement ratio: {parent_cell_size/sub_cell_size}×")

# Get buffer distances from configuration
buffer_north_m = scenario_config['submodel']['buffer_north_m']
buffer_south_m = scenario_config['submodel']['buffer_south_m']
buffer_east_m = scenario_config['submodel']['buffer_east_m']
buffer_west_m = scenario_config['submodel']['buffer_west_m']

print(f"\nBuffer distances from config:")
print(f"  North: {buffer_north_m} m")
print(f"  South: {buffer_south_m} m")
print(f"  East: {buffer_east_m} m")
print(f"  West: {buffer_west_m} m")

# Combine wells and source for extent calculation
# Include source location in the extent calculation
all_points_x = list(wells_gdf.geometry.x) + [source_easting]
all_points_y = list(wells_gdf.geometry.y) + [source_northing]

points_x_min = min(all_points_x)
points_x_max = max(all_points_x)
points_y_min = min(all_points_y)
points_y_max = max(all_points_y)

print(f"\nWells + Source extent:")
print(f"  X: [{points_x_min:.1f}, {points_x_max:.1f}] (span: {points_x_max - points_x_min:.1f} m)")
print(f"  Y: [{points_y_min:.1f}, {points_y_max:.1f}] (span: {points_y_max - points_y_min:.1f} m)")

# Calculate submodel bounds in real-world coordinates (Swiss coordinate system)
submodel_xmin = points_x_min - buffer_west_m
submodel_xmax = points_x_max + buffer_east_m  
submodel_ymin = points_y_min - buffer_south_m
submodel_ymax = points_y_max + buffer_north_m

print(f"\nSubmodel extent with buffers (real-world coordinates):")
print(f"  X: [{submodel_xmin:.1f}, {submodel_xmax:.1f}] (span: {submodel_xmax - submodel_xmin:.1f} m)")
print(f"  Y: [{submodel_ymin:.1f}, {submodel_ymax:.1f}] (span: {submodel_ymax - submodel_ymin:.1f} m)")

# Calculate expected grid dimensions
expected_ncol = int(np.ceil((submodel_xmax - submodel_xmin) / sub_cell_size))
expected_nrow = int(np.ceil((submodel_ymax - submodel_ymin) / sub_cell_size))

print(f"\nExpected submodel grid dimensions:")
print(f"  {expected_nrow} rows × {expected_ncol} cols")
print(f"  Total cells: {expected_nrow * expected_ncol:,}")

# Get parent model grid parameters for coordinate transformation
parent_xll = parent_modelgrid.xoffset
parent_yll = parent_modelgrid.yoffset
parent_rotation = parent_modelgrid.angrot

print(f"\nParent model grid parameters:")
print(f"  Origin (xll, yll): ({parent_xll:.1f}, {parent_yll:.1f})")
print(f"  Rotation angle: {parent_rotation} degrees")

# Convert wells and source from real-world to local model coordinates
# This removes the rotation and offset, putting points in the model's local coordinate system
points_local_coords = []

# Add wells
for idx, well in wells_gdf.iterrows():
    local_x, local_y = parent_modelgrid.get_local_coords(well.geometry.x, well.geometry.y)
    points_local_coords.append((local_x, local_y))
    print(f"Well {well.get('GWR_PREFIX', idx)}: ({well.geometry.x:.1f}, {well.geometry.y:.1f}) -> local ({local_x:.1f}, {local_y:.1f})")

# Add source
source_local_x, source_local_y = parent_modelgrid.get_local_coords(source_easting, source_northing)
points_local_coords.append((source_local_x, source_local_y))
print(f"Source: ({source_easting:.1f}, {source_northing:.1f}) -> local ({source_local_x:.1f}, {source_local_y:.1f})")

# Calculate submodel bounds in local coordinates
points_local_x = [coord[0] for coord in points_local_coords]
points_local_y = [coord[1] for coord in points_local_coords]

local_points_x_min = min(points_local_x)
local_points_x_max = max(points_local_x)
local_points_y_min = min(points_local_y)  
local_points_y_max = max(points_local_y)

print(f"\nWells + Source extent in local coordinates:")
print(f"  X: [{local_points_x_min:.1f}, {local_points_x_max:.1f}]")
print(f"  Y: [{local_points_y_min:.1f}, {local_points_y_max:.1f}]")

# Add buffers in local coordinates
submodel_local_xmin = local_points_x_min - buffer_west_m
submodel_local_xmax = local_points_x_max + buffer_east_m
submodel_local_ymin = local_points_y_min - buffer_south_m  
submodel_local_ymax = local_points_y_max + buffer_north_m

print(f"\nSubmodel bounds in local coordinates (with buffers):")
print(f"  X: [{submodel_local_xmin:.1f}, {submodel_local_xmax:.1f}]")
print(f"  Y: [{submodel_local_ymin:.1f}, {submodel_local_ymax:.1f}]")

# Convert submodel boundary back to real-world coordinates
# Create corner points in local coordinates
local_corners = [
    (submodel_local_xmin, submodel_local_ymin),  # SW
    (submodel_local_xmax, submodel_local_ymin),  # SE  
    (submodel_local_xmax, submodel_local_ymax),  # NE
    (submodel_local_xmin, submodel_local_ymax),  # NW
]

# Transform back to real-world coordinates
real_world_corners = []
for local_x, local_y in local_corners:
    real_x, real_y = parent_modelgrid.get_coords(local_x, local_y)
    real_world_corners.append((real_x, real_y))
    
print(f"\nSubmodel corners in real-world coordinates:")
for i, (x, y) in enumerate(real_world_corners):
    corners = ['SW', 'SE', 'NE', 'NW']
    print(f"  {corners[i]}: ({x:.1f}, {y:.1f})")

# Create the properly aligned submodel boundary polygon
real_world_boundary_coords = real_world_corners + [real_world_corners[0]]
submodel_boundary_poly = Polygon(real_world_boundary_coords)

# Clip the submodel boundary to the parent model boundary
parent_model_boundary_file = download_named_file(
    name='model_boundary',
    data_type='gis'
)
parent_model_boundary = gpd.read_file(parent_model_boundary_file)
clipped_submodel_boundary = submodel_boundary_poly.intersection(parent_model_boundary.geometry[0])

# Create GeoDataFrame for the aligned submodel boundary
submodel_boundary_gdf = gpd.GeoDataFrame(
    [{'geometry': clipped_submodel_boundary, 'name': 'transport_submodel_domain'}],
    crs=parent_modelgrid.crs
)

print(f"\nAligned submodel boundary created:")
print(f"  Area: {clipped_submodel_boundary.area / 1e6:.3f} km²")
print(f"  Perimeter: {clipped_submodel_boundary.length / 1e3:.2f} km")

### Visualize Submodel Domain

Plot the submodel extent on the parent model grid.

In [None]:
# Visualize the aligned submodel domain with source and wells
fig, ax = plt.subplots(figsize=(16, 14))

# Plot parent model
pmv = flopy.plot.PlotMapView(model=m_parent_base, modelgrid=parent_modelgrid, ax=ax)
pmv.plot_grid(color='lightgrey', alpha=0.3, linewidth=0.5)

# Plot ibound
pmv.plot_array(m_parent_base.bas6.ibound.array, alpha=0.3, cmap='RdYlBu', vmin=-1, vmax=1)

# Plot head contours for reference
contour_levels = np.linspace(np.nanmin(heads_masked), np.nanmax(heads_masked), 10)
cont = pmv.contour_array(heads_masked, levels=contour_levels, colors='blue', 
                        linewidths=1, linestyles='--', alpha=0.5)
ax.clabel(cont, inline=True, fontsize=8, fmt='%.0f m')

# Plot aligned submodel boundary
submodel_boundary_gdf.plot(ax=ax, facecolor='orange', alpha=0.2, edgecolor='red', 
                          linewidth=3, label='Transport Submodel Domain', zorder=3)

# Plot wells
wells_gdf.plot(ax=ax, color='blue', markersize=150, label='Wells', zorder=5,
               marker='o', edgecolors='white', linewidth=2)

# Plot source location
source_point.plot(ax=ax, color='red', markersize=300, label='Contamination Source', 
                 zorder=6, marker='*', edgecolors='darkred', linewidth=2)

# Add well labels
for idx, well in wells_gdf.iterrows():
    ax.annotate(well['GWR_PREFIX'], 
                xy=(well.geometry.x, well.geometry.y),
                xytext=(10, 10), textcoords='offset points',
                fontsize=9, color='blue', fontweight='bold',
                bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.7))

# Add scale information text box
domain_info = (f"Domain: {submodel_xmax - submodel_xmin:.0f}m × {submodel_ymax - submodel_ymin:.0f}m\n"
               f"Grid: {expected_nrow}×{expected_ncol} cells ({sub_cell_size}m)\n"
               f"Total cells: {expected_nrow * expected_ncol:,}\n"
               f"2-year travel: ~{travel_distance:.0f}m")
ax.text(0.02, 0.02, domain_info, transform=ax.transAxes, fontsize=10,
        verticalalignment='bottom', bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

# Create legend
legend_handles = [
    mpatches.Patch(facecolor='orange', alpha=0.3, edgecolor='red', linewidth=2, 
                   label='Transport Submodel Domain'),
    mlines.Line2D([], [], marker='*', color='red', markeredgecolor='darkred', 
                  markeredgewidth=2, markersize=15, linestyle='None', label='Contamination Source'),
    mlines.Line2D([], [], marker='o', color='blue', markeredgecolor='white', 
                  markeredgewidth=2, markersize=10, linestyle='None', label='Wells')
]

ax.set_title(f'Transport Submodel Domain - Group {group_number}\n'
             f'Grid-Aligned Domain for 2-Year Plume Migration Analysis',
             fontsize=14, fontweight='bold')
ax.set_xlabel('X Coordinate (m)', fontsize=12)
ax.set_ylabel('Y Coordinate (m)', fontsize=12)
ax.legend(handles=legend_handles, loc='upper right', fontsize=11)
ax.set_aspect('equal')

plt.tight_layout()
plt.show()

# Print summary
print("\n" + "="*60)
print("SUBMODEL DOMAIN SUMMARY")
print("="*60)
print(f"Purpose: 2-year TCE transport simulation")
print(f"Domain size: {submodel_xmax - submodel_xmin:.0f} × {submodel_ymax - submodel_ymin:.0f} m")
print(f"Grid: {expected_nrow} rows × {expected_ncol} cols at {sub_cell_size}m resolution")
print(f"Total cells: {expected_nrow * expected_ncol:,}")
print(f"Estimated 2-year plume travel: ~{travel_distance:.0f} m")
print(f"Buffer adequacy: {'✓ Adequate' if buffer_north_m >= buffer_min else '⚠ May be insufficient'}")
print("="*60)

---
## 7. Create Telescope Submodel for Flow

### Why Telescope?

Transport modeling requires finer grid resolution than flow modeling because:
- Need to resolve source area (sharp concentration gradients)
- Peclet number constraint: Δx ≤ 2·αL (for αL=10m → Δx ≤ 20m)
- Parent model cells (~50×50 m) are too coarse for accurate transport
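
The Peclet constraint can be checked with a one-line calculation. With the grid Peclet number defined as Pe = v·Δx / D and D ≈ αL·v (molecular diffusion neglected), the velocity cancels and Pe = Δx / αL. A minimal sketch, assuming the αL = 10 m used in the bullet above and three candidate cell sizes:

```python
def grid_peclet(cell_size_m, alpha_l_m):
    """Grid Peclet number, assuming dispersion dominates over diffusion."""
    return cell_size_m / alpha_l_m


alpha_l = 10.0  # longitudinal dispersivity in m (value from the bullet above)

for dx in (50.0, 20.0, 5.0):
    pe = grid_peclet(dx, alpha_l)
    status = "ok" if pe <= 2.0 else "too coarse"
    print(f"dx = {dx:4.0f} m -> Pe = {pe:.1f} ({status})")
```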

The telescope approach:
1. Uses coarse parent model for regional flow (efficient)
2. Creates refined submodel only where needed (5m cells)
3. Extracts boundary conditions from parent model
4. Runs transport on refined grid (accurate)

We'll apply the exact same grid generation workflow from notebook 4, using our submodel boundary polygon as the domain definition. This ensures consistency with the established methodology.

### Create Refined Grid

Generate a refined grid for the submodel domain with 5m cell spacing.

#### Rotate the submodel boundary polygon for regular grid alignment
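
The rotation angle matters because a regular grid covers the axis-aligned bounding box of the rotated boundary: the better the boundary lines up with the axes, the fewer cells fall outside it. The sketch below scans candidate angles for a hypothetical tilted rectangle and picks the one with the smallest bounding box. It uses plain trigonometry in place of `shapely.affinity.rotate`, with the same convention that positive angles rotate counter-clockwise; the domain shape and the 5-degree step are assumptions for illustration.

```python
import math


def rotate_points(points, angle_deg, origin=(0.0, 0.0)):
    """Rotate (x, y) points counter-clockwise about origin by angle_deg."""
    a = math.radians(angle_deg)
    ox, oy = origin
    return [
        (ox + (x - ox) * math.cos(a) - (y - oy) * math.sin(a),
         oy + (x - ox) * math.sin(a) + (y - oy) * math.cos(a))
        for x, y in points
    ]


def bbox_area(points):
    """Area of the axis-aligned bounding box around the points."""
    xs, ys = zip(*points)
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical domain: a 600 m x 300 m rectangle tilted by -30 degrees
domain = rotate_points([(0, 0), (600, 0), (600, 300), (0, 300)], -30)

# Scan candidate angles and keep the one with the smallest bounding box
best_angle = min(range(0, 90, 5), key=lambda ang: bbox_area(rotate_points(domain, ang)))
best_area = bbox_area(rotate_points(domain, best_angle))

print(f"Unrotated bounding box: {bbox_area(domain):,.0f} m2")
print(f"Best angle: {best_angle} degrees, bounding box: {best_area:,.0f} m2")
```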

In [None]:
# Buffer the model boundary gdf
submodel_boundary_gdf['geometry'] = submodel_boundary_gdf['geometry'].buffer(10)

# Rotation angle in degrees, identified by trial and error. Adjust it to
# minimize the number of grid cells that fall outside the boundary.
grid_rotation_angle = 30
origin_rotation = Point(0, 0)  # Origin for rotation, can be adjusted as needed
# Rotate the model boundary polygon
submodel_boundary_gdf_rotated = submodel_boundary_gdf.copy()

submodel_boundary_gdf_rotated['geometry'] = submodel_boundary_gdf_rotated['geometry'].apply(
    lambda geom: rotate(geom, grid_rotation_angle, origin=origin_rotation)
)
# Get the bounding box of the rotated geometry
xmin_rotated, ymin_rotated, xmax_rotated, ymax_rotated = submodel_boundary_gdf_rotated.total_bounds
# Plot the rotated boundary to verify
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
submodel_boundary_gdf_rotated.plot(ax=ax, facecolor='none', edgecolor='blue', linewidth=2)
ax.set_title("Figure 2: Rotated Model Boundary.")
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
plt.show()

#### Create Sub-model Grid 

In [None]:
# --- 2. Create a new model grid from the rotated model boundary ---
# The bounding box of the rotated boundary defines the grid dimensions.
# The grid is built in the rotated coordinate system first and is rotated
# back to the original coordinate system in the next step.
width_rotated = xmax_rotated - xmin_rotated
height_rotated = ymax_rotated - ymin_rotated

# Calculate the number of rows and columns based on the rotated bounding box
ncol_rotated = int(np.ceil(width_rotated / sub_cell_size)) - 1 # Based on visual inspection of rotated grid.
nrow_rotated = int(np.ceil(height_rotated / sub_cell_size))

# Compare number of rows and columns with the original grid
print(f"Rotated Grid: {ncol_rotated} columns, {nrow_rotated} rows")

# Define the delr and delc for the rotated grid
delr_rotated = np.full(ncol_rotated, sub_cell_size)
delc_rotated = np.full(nrow_rotated, sub_cell_size)
nlay = parent_modelgrid.nlay

# Plot the rotated grid and the rotated boundary to verify
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
# Create a new StructuredGrid with the rotated dimensions
rotated_grid = StructuredGrid(
    delr=delr_rotated,
    delc=delc_rotated,
    top=np.ones((nrow_rotated, ncol_rotated)) * 100,  # Example top elevation
    botm=np.ones((nlay, nrow_rotated, ncol_rotated)) * 50,  # Example bottom elevation
    xoff=xmin_rotated,  # Use the lower-left of the rotated extent
    yoff=ymin_rotated,  # Use the lower-left of the rotated extent
    angrot=0,  # We are currently in the rotated coordinate system, so no additional rotation is needed
    lenuni=2,  # Length unit code: 2 for meters
    crs=submodel_boundary_gdf_rotated.crs.to_string()  # Automatically get CRS from geopackage
)
pmv = flopy.plot.PlotMapView(modelgrid=rotated_grid, ax=ax)
pc = pmv.plot_array(rotated_grid.top, alpha=0.5, cmap='terrain')
pmv.plot_grid()
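# Overlay the rotated boundary so the figure matches its title
submodel_boundary_gdf_rotated.plot(ax=ax, facecolor='none', edgecolor='blue', linewidth=2)
ax.set_aspect('equal', adjustable='box') # Ensure correct aspect ratio
ax.set_title("Figure 3: Rotated FloPy Grid with Rotated Boundary.")
plt.show()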

#### Rotate the new submodel grid to align with parent grid rotation

In [None]:
# --- 3. Rotation of the new Model Grid in the CH Coordinate System ---
# Now we need to rotate the lower-left corner of the rotated grid back to the 
# original coordinate system.
# The lower-left corner of the rotated bounding box
# Create points from the rotated bounding box coordinates
min_point_rotated = Point(xmin_rotated, ymin_rotated)
max_point_rotated = Point(xmax_rotated, ymax_rotated)

# Apply inverse rotation (negative angle) around the same origin
min_point_original = rotate(min_point_rotated, -grid_rotation_angle, 
                            origin=origin_rotation)
max_point_original = rotate(max_point_rotated, -grid_rotation_angle, 
                            origin=origin_rotation)

# Extract the coordinates
xmin_original = min_point_original.x
ymin_original = min_point_original.y
xmax_original = max_point_original.x
ymax_original = max_point_original.y

print(f"Original coordinates after inverse rotation:")
print(f"xmin: {xmin_original:.2f}, ymin: {ymin_original:.2f}")
print(f"xmax: {xmax_original:.2f}, ymax: {ymax_original:.2f}")

xll = xmin_original
yll = ymin_original

print(f"Corrected grid lower-left corner:")
print(f"xll = {xll:.2f}")
print(f"yll = {yll:.2f}")
print(f"Number of cells in the rotated grid: {nrow_rotated * ncol_rotated * nlay}")

# Create the FloPy structured grid with the rotated bounding box
sub_modelgrid = StructuredGrid(
    delr=delr_rotated,
    delc=delc_rotated,
    xoff=xmin_original,  # Lower-left corner, rotated back to the original coordinate system
    yoff=ymin_original,  # Lower-left corner, rotated back to the original coordinate system
    angrot=-grid_rotation_angle,  # Apply the desired rotation to the grid
    lenuni=2,  # Length unit code: 2 for meters
    crs=submodel_boundary_gdf.crs.to_string()  # Automatically get CRS from geopackage
)

# Update grid polygons, tag active cells (≥50% inside), and get IBOUND
grid_gdf, ibound = grid_utils.build_grid_gdf_and_ibound(
    modelgrid=sub_modelgrid,
    boundary_gdf=submodel_boundary_gdf,        # your boundary GeoDataFrame
    frac_threshold=0.5,      # change if needed
    nlay=nlay                 # use your model's nlay
)
# Count the number of active cells (IBOUND > 0)
active_cells = np.count_nonzero(ibound > 0)
print(f"Total number of active cells in the grid: {active_cells}")

print("Model grid created with the following parameters:")
print(sub_modelgrid)

# Plot the rotated grid and the model_boundary to check alignment
fig, ax = plt.subplots(1, 1, figsize=(12, 12))
pmv = flopy.plot.PlotMapView(modelgrid=sub_modelgrid, ax=ax)
pmv.plot_grid() 
submodel_boundary_gdf.plot(ax=ax, facecolor='none', edgecolor='red', linewidth=2)
red_line = mlines.Line2D([], [], color='red', linewidth=2, label='Model Boundary')
ax.legend(handles=[red_line], loc='upper right')
ax.set_title("Figure 4: Correctly Rotated Grid with Model Boundaries")
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
ax.set_aspect('equal', adjustable='box') # Ensure correct aspect ratio (must be set before plt.show())
plt.show()

### Interpolate Aquifer Properties

Interpolate hydraulic conductivity, layer elevations, and other properties from parent to refined grid.
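
One simple way to transfer a parent property to the refined grid is nearest-neighbour sampling at the submodel cell centers. The sketch below is a NumPy-only illustration under strong assumptions: an unrotated parent grid with square cells, and a hypothetical 2 × 2 K field. The function `resample_nearest` is not part of the course utilities, and a rotated submodel grid would need its real-world cell centers (for example `sub_modelgrid.xcellcenters`) as input.

```python
import numpy as np


def resample_nearest(parent, x0, y_top, parent_dx, sub_x, sub_y):
    """Sample a 2D parent array at sub-grid cell centers (nearest parent cell).

    Assumes an unrotated parent grid whose upper-left corner is (x0, y_top)
    and whose rows run from north to south, as in MODFLOW.
    """
    col = np.floor((sub_x - x0) / parent_dx).astype(int)
    row = np.floor((y_top - sub_y) / parent_dx).astype(int)
    # Clamp indices so that points on the outer edge stay inside the array
    col = np.clip(col, 0, parent.shape[1] - 1)
    row = np.clip(row, 0, parent.shape[0] - 1)
    return parent[row, col]


# Hypothetical 2 x 2 parent K field [m/day] on 50 m cells
parent_k = np.array([[10.0, 20.0],
                     [30.0, 40.0]])
# Cell centers of a 5 m sub-grid covering the same 100 m x 100 m area
centers = np.arange(2.5, 100.0, 5.0)
sub_x, sub_y = np.meshgrid(centers, centers[::-1])  # first row = northernmost
sub_k = resample_nearest(parent_k, 0.0, 100.0, 50.0, sub_x, sub_y)
print(sub_k.shape, np.unique(sub_k))
```

Nearest-neighbour sampling preserves the parent values exactly but produces blocky fields. Smoother alternatives exist, and the UPW cell further below instead assigns uniform median values taken from the parent model.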


In [None]:
# TODO: Interpolate from parent to submodel:
# - Hydraulic conductivity (K)
# - Layer top and bottom elevations
# - Specific storage (if transient)
# Visualize interpolated K field

#### Model Top

In [None]:
# Load and resample DEM to submodel grid (following notebook 4)
dem_path = download_named_file('dem_hres', data_type='gis')
rio = flopy.utils.Raster.load(dem_path)

print(f"DEM loaded:")
print(f"  CRS: {rio.crs}")
print(f"  Bounds: {rio.bounds}")

# Resample DEM to submodel grid
print("Resampling DEM to submodel grid...")
import time
t0 = time.time()
submodel_top = rio.resample_to_grid(sub_modelgrid, band=rio.bands[0], method="nearest")
resample_time = time.time() - t0

# Clean up the resampled data
submodel_top = np.round(submodel_top, 1)  # Round to 10 cm
valid = np.isfinite(submodel_top) & (submodel_top > 0)

if not np.any(valid):
    raise RuntimeError("No valid DEM data found in submodel area")

print(f"DEM resampling completed in {resample_time:.2f} seconds")
print(f"  Elevation range: {submodel_top[valid].min():.1f} to {submodel_top[valid].max():.1f} m")

# Plot the submodel_top on the submodel grid
fig, ax = plt.subplots(figsize=(10, 10))
pmv = flopy.plot.PlotMapView(modelgrid=sub_modelgrid, ax=ax)
im = pmv.plot_array(submodel_top, cmap='terrain', vmin=np.nanmin(submodel_top), vmax=np.nanmax(submodel_top))
pmv.plot_grid(color='grey', alpha=0.2)
cbar = plt.colorbar(im, ax=ax, shrink=0.5)
cbar.set_label('Elevation (m a.s.l.)')
ax.set_title('Resampled DEM on Submodel Grid')
ax.set_aspect('equal', adjustable='box') # Ensure correct aspect ratio
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
plt.show()

#### Model Bottom

In [None]:
# Define submodel bottom based on groundwater levels and aquifer thickness
# Load groundwater levels from file & interpolate to submodel grid
isolines = download_named_file('groundwater_map_norm', data_type='gis')
gdf_isolines = gpd.read_file(isolines, layer='GS_GW_ISOHYPSE_MW_L')
gw_elevations = grid_utils.interpolate_isohypses_to_grid(gdf_isolines, sub_modelgrid)

'''# Optional, plot gw_elevations for verification
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
im = ax.imshow(gw_elevations, extent=sub_modelgrid.extent, origin='upper', cmap='Blues')
ax.set_title("Figure 5: Interpolated Groundwater Elevations on Submodel Grid")
ax.set_xlabel("X-coordinate")
ax.set_ylabel("Y-coordinate")
plt.colorbar(im, ax=ax, label="Groundwater Elevation (m a.s.l.)")
ax.set_aspect('equal', adjustable='box')
plt.show()'''

# Load groundwater thickness from file, requires 4_model_implementation.ipynb to have been run once first
workspace = os.path.join(get_default_data_folder(), 'limmat_valley_model')
thickness_path = os.path.join(workspace, 'aquifer_thickness_contours.gpkg')
aquifer_thickness_gdf = gpd.read_file(thickness_path, layer='aquifer_thickness_contours')
# Interpolate aquifer thickness to submodel grid
aquifer_thickness_resampled = grid_utils.interpolate_aquifer_thickness_to_grid_with_contour_densification(
    contour_gdf=aquifer_thickness_gdf,
    modelgrid=sub_modelgrid,
    thickness_column='aquifer_thickness',
    contour_interval=2.0,  # Create intermediate contours every 2m
    plot_intermediate=False,  # Set to True to show the contour densification step
    plot_points=False,  # Set to True if you want to see final interpolation points
    buffer_distance=300
)
# Smooth the resampled aquifer thickness to remove small-scale noise
from scipy.ndimage import gaussian_filter
aquifer_thickness_resampled = gaussian_filter(aquifer_thickness_resampled, sigma=4)

'''# Plot aquifer thickness for verification
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
im = ax.imshow(aquifer_thickness_resampled, extent=sub_modelgrid.extent, origin='upper', cmap='YlOrBr')
ax.set_title("Figure 6: Interpolated Aquifer Thickness on Submodel Grid")
ax.set_xlabel("X-coordinate")
ax.set_ylabel("Y-coordinate")
plt.colorbar(im, ax=ax, label="Aquifer Thickness (m)")
ax.set_aspect('equal', adjustable='box')
plt.show()'''

# Calculate bottom elevation
submodel_bottom = gw_elevations - aquifer_thickness_resampled

# Ensure bottom is 3D array format
if submodel_bottom.ndim == 2:
    submodel_bottom = submodel_bottom[np.newaxis, :, :]

print(f"Submodel bottom calculated:")
print(f"  Bottom range: {submodel_bottom[0][valid].min():.1f} to {submodel_bottom[0][valid].max():.1f} m")

# Define the delr and delc for the submodel grid
delr = np.full(sub_modelgrid.ncol, sub_cell_size)
delc = np.full(sub_modelgrid.nrow, sub_cell_size)

# Update the submodel grid with real elevations
submodel_grid = StructuredGrid(
    delr=delr,
    delc=delc,
    top=submodel_top,
    botm=submodel_bottom,
    nlay=nlay,
    xoff=xll,
    yoff=yll,
    angrot=-grid_rotation_angle
)

print("Submodel grid updated with DEM elevations")

# Plot the final submodel grid with bottom elevations
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
pmv = flopy.plot.PlotMapView(modelgrid=submodel_grid, ax=ax)
im = pmv.plot_array(submodel_grid.botm, cmap='terrain', vmin=np.nanmin(submodel_grid.botm), vmax=np.nanmax(submodel_grid.botm))
pmv.plot_grid(color='grey', alpha=0.2)
cbar = plt.colorbar(im, ax=ax, shrink=0.5)
cbar.set_label('Elevation (m a.s.l.)')
ax.set_title('Submodel Bottom Elevations')
ax.set_aspect('equal', adjustable='box') # Ensure correct aspect ratio
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
plt.show()

#### Sub-model DIS package

In [None]:
# Define working directory for submodel
# Create a dedicated output workspace for the transport submodel
transport_base_ws = os.path.expanduser(cfg['output']['workspace'])
# Add group number to path and add sub-directory for the sub_base model
sub_base_ws = transport_base_ws + str(group_number) + "/sub_base"
case_utils.ensure_dir(sub_base_ws)

print(f"Transport submodel workspace: {sub_base_ws}")

# Create the sub_base model
m_sub_base = flopy.modflow.Modflow(
    parent_base_namefile.replace('.nam', ''), 
    model_ws=sub_base_ws,
    version='mfnwt',
    exe_name='mfnwt.exe'  # Ensure the executable is correctly specified
)

# Get temporal parameters from parent model
nper = m_parent_base.dis.nper
perlen = m_parent_base.dis.perlen.array
nstp = m_parent_base.dis.nstp.array
tsmult = m_parent_base.dis.tsmult.array # Time step multiplier
steady = m_parent_base.dis.steady.array

sub_base_dis = flopy.modflow.ModflowDis(
    model=m_sub_base,
    model_ws=sub_base_ws,
    nlay=nlay,
    nrow=sub_modelgrid.nrow,
    ncol=sub_modelgrid.ncol,
    delr=delr,
    delc=delc,
    xul=xll,
    yul=yll + (sub_modelgrid.nrow * sub_cell_size),  # Upper-left y-coordinate
    rotation=-grid_rotation_angle,  # ModflowDis names this argument `rotation`, not `angrot`
    crs=sub_modelgrid.crs.to_string(),
    top=submodel_top,
    botm=submodel_bottom,
    nper=nper,
    perlen=perlen,
    nstp=nstp,
    tsmult=tsmult,
    steady=steady,
    itmuni=4,  # Time unit: days
    lenuni=2,  # Length unit: meters
)

# Plot sub_base_grid for verification
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
pmv = flopy.plot.PlotMapView(model=m_sub_base, ax=ax)
pmv.plot_grid(color='grey', alpha=0.2)
ax.set_title("Submodel Grid Verification")
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
ax.set_aspect('equal', adjustable='box') # Ensure correct aspect ratio
plt.show()

#### UPW package

In [None]:
# Get the UPW package from parent model
parent_upw = m_parent_base.upw
parent_ibound = m_parent_base.bas6.ibound.array  # Get parent model IBOUND

# Extract hydraulic conductivity arrays from parent model
parent_hk = parent_upw.hk.array  # Horizontal hydraulic conductivity
parent_vka = parent_upw.vka.array  # Vertical hydraulic conductivity (or anisotropy ratio)
parent_sy = parent_upw.sy.array  # Specific yield
parent_ss = parent_upw.ss.array  # Specific storage

print(f"Parent model aquifer parameters:")
print(f"  HK range: {parent_hk.min():.2f} to {parent_hk.max():.2f} m/day")
print(f"  VKA range: {parent_vka.min():.6f} to {parent_vka.max():.6f}")

# Check parent model active cells
if parent_ibound.ndim == 3:
    parent_active_cells = (parent_ibound[0, :, :] == 1).sum()
    active_mask_parent = parent_ibound[0, :, :] == 1
    print(f"  Active cells in parent model: {parent_active_cells:,}")
else:
    parent_active_cells = (parent_ibound == 1).sum()
    active_mask_parent = parent_ibound == 1
    print(f"  Active cells in parent model: {parent_active_cells:,}")

# Use representative uniform values from active parent cells instead of interpolation
print("\nUsing uniform parameter values from active parent model cells...")
print(f"Submodel grid shape: {sub_modelgrid.nrow} x {sub_modelgrid.ncol}")

# Extract statistics from active cells only (exclude zeros and inactive cells)
active_hk = parent_hk[0][active_mask_parent]
active_vka = parent_vka[0][active_mask_parent]

# Filter out zeros and get representative values
valid_hk = active_hk[active_hk > 0]
valid_vka = active_vka[active_vka > 0]

# Use median values for uniform parameters (more robust than mean)
uniform_hk = np.median(valid_hk) if len(valid_hk) > 0 else 20.0  # Default for gravel aquifer
uniform_vka = np.median(valid_vka) if len(valid_vka) > 0 else 2.0  # Default VKA

print(f"\nRepresentative uniform values from active cells:")
print(f"  Uniform HK: {uniform_hk:.2f} m/day (from {len(valid_hk)} active cells)")
print(f"  Uniform VKA: {uniform_vka:.6f} (from {len(valid_vka)} active cells)")

# Create uniform parameter arrays for submodel
sub_hk = np.full((nlay, sub_modelgrid.nrow, sub_modelgrid.ncol), uniform_hk)
sub_vka = np.full((nlay, sub_modelgrid.nrow, sub_modelgrid.ncol), uniform_vka)

# Ensure all values are physically realistic (positive)
sub_hk = np.maximum(sub_hk, 0.1)  # Minimum 0.1 m/day
sub_vka = np.maximum(sub_vka, 0.001)  # Minimum VKA

print(f"\nSubmodel aquifer parameters (uniform values, all >0):")
print(f"  HK: {sub_hk.min():.2f} to {sub_hk.max():.2f} m/day (uniform)")
print(f"  VKA: {sub_vka.min():.6f} to {sub_vka.max():.6f} (uniform)")

# Verify array dimensions match submodel grid
print(f"\nArray dimension verification:")
print(f"  sub_hk shape: {sub_hk.shape}")
print(f"  Expected shape: ({nlay}, {sub_modelgrid.nrow}, {sub_modelgrid.ncol})")

# Create arrays with submodel dimensions for parent parameters
# For parameters that don't need interpolation, create uniform arrays
sub_laytyp = np.ones(nlay, dtype=int) * parent_upw.laytyp.array[0]  # Use first layer value

# Extract scalar value from parent hani array - it might be 2D or 3D
if parent_upw.hani.array.ndim == 3:
    hani_value = parent_upw.hani.array[0, 0, 0]  # 3D array: [layer, row, col]
elif parent_upw.hani.array.ndim == 2:
    hani_value = parent_upw.hani.array[0, 0]     # 2D array: [row, col]
else:
    hani_value = parent_upw.hani.array[0]        # 1D array: [layer]

# Ensure hani_value is reasonable (>0)
if hani_value <= 0:
    hani_value = 1.0  # Default isotropic
    print(f"  Warning: Parent HANI ≤0, using default value: {hani_value}")

print(f"  hani_value used: {hani_value}")

# Create UPW package for submodel with uniform arrays
sub_base_upw = flopy.modflow.ModflowUpw(
    m_sub_base,
    laytyp=sub_laytyp,
    hk=sub_hk,
    hani=hani_value,  # Use scalar value
    vka=sub_vka,
    ipakcb=53  # Save cell-by-cell budget
)

# Visualize uniform hydraulic conductivity
fig, axes = plt.subplots(1, 2, figsize=(16, 8))

# Plot HK
im1 = axes[0].imshow(sub_hk[0], extent=sub_modelgrid.extent, origin='upper', cmap='viridis')
axes[0].set_title('Hydraulic Conductivity (HK)\n[m/day]')
plt.colorbar(im1, ax=axes[0], shrink=0.7)

# Plot VKA
im2 = axes[1].imshow(sub_vka[0], extent=sub_modelgrid.extent, origin='upper', cmap='plasma')
axes[1].set_title('Vertical Hydraulic Conductivity (VKA)\n[m/day or ratio]')
plt.colorbar(im2, ax=axes[1], shrink=0.7)

# Format axes
for ax in axes.flat:
    ax.set_xlabel('X Coordinate (m)')
    ax.set_ylabel('Y Coordinate (m)')
    ax.set_aspect('equal')

plt.suptitle('Uniform Aquifer Parameters on Submodel Grid', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("✓ UPW package created with uniform parameters from parent model statistics")
print("✓ All hydraulic conductivity values are >0 and physically realistic")

### Extract Boundary Conditions

In [None]:
# TODO: Extract parent model heads at submodel boundaries
# Create CHD or GHB package for submodel
# Extract river cells within submodel (if any)
# Extract recharge (if applicable)

#### BAS package

Extract heads from parent model along submodel boundaries to use as boundary conditions.
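
The boundary heads are obtained by inverse-distance weighting (IDW) of the nearest parent cells. The sketch below shows the weighting on hypothetical coordinates and heads. It uses a brute-force NumPy distance matrix instead of the `cKDTree` search used in the notebook cells, which is only practical for small point sets; the function name `idw_heads` is illustrative.

```python
import numpy as np


def idw_heads(src_xy, src_heads, dst_xy, k=5, eps=1e-10):
    """Inverse-distance-weighted head at each target point from its k nearest sources."""
    src_xy = np.asarray(src_xy, dtype=float)
    dst_xy = np.asarray(dst_xy, dtype=float)
    src_heads = np.asarray(src_heads, dtype=float)
    k = min(k, len(src_heads))
    # Brute-force distance matrix, shape (n_dst, n_src)
    dist = np.linalg.norm(dst_xy[:, None, :] - src_xy[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Avoid division by zero where a target coincides with a source point
    d = np.maximum(np.take_along_axis(dist, nearest, axis=1), eps)
    w = 1.0 / d
    w /= w.sum(axis=1, keepdims=True)
    return np.sum(src_heads[nearest] * w, axis=1)


# Hypothetical parent cell centers [m] and heads [m a.s.l.]
parent_xy = np.array([[0.0, 0.0], [50.0, 0.0], [0.0, 50.0], [50.0, 50.0]])
parent_h = np.array([400.0, 399.0, 400.0, 399.0])
boundary_xy = np.array([[25.0, 25.0], [0.0, 0.0]])
h = idw_heads(parent_xy, parent_h, boundary_xy, k=4)
print(h)
```

A target point that is equidistant from all sources receives their plain average, while a point that coincides with a source effectively takes that source's head.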

In [None]:
# Extract parent model heads for boundary interpolation
print("Extracting parent model heads for boundary condition interpolation...")

# Load parent model heads
parent_hds_path = os.path.join(parent_workspace, f"{m_parent_base.name}.hds")
if not os.path.exists(parent_hds_path):
    print("Parent model heads file not found. Running parent model...")
    success, buff = m_parent_base.run_model(silent=True, report=True)
    if not success:
        raise RuntimeError("Parent model failed to run")
    print("✓ Parent model run completed")

# Load parent heads
headobj_parent = flopy.utils.HeadFile(parent_hds_path)
parent_heads = headobj_parent.get_data()[0]  # Layer 0; get_data() without arguments returns the last saved time step

# Get parent model grid coordinates and active cells
parent_ibound = m_parent_base.bas6.ibound.array[0]  # Layer 0
parent_active_mask = parent_ibound == 1

# Extract coordinates of active parent cells
parent_x_centers = parent_modelgrid.xcellcenters[parent_active_mask]
parent_y_centers = parent_modelgrid.ycellcenters[parent_active_mask]
parent_heads_active = parent_heads[parent_active_mask]

# Filter out any invalid heads (NaN, inf, or unrealistic values)
valid_head_mask = (np.isfinite(parent_heads_active) & 
                   (parent_heads_active > 300) &  # Reasonable lower bound for elevation
                   (parent_heads_active < 600))   # Reasonable upper bound for elevation

parent_coords_valid = np.column_stack([
    parent_x_centers[valid_head_mask],
    parent_y_centers[valid_head_mask]
])
parent_heads_valid = parent_heads_active[valid_head_mask]

print(f"Parent model head extraction:")
print(f"  Total parent cells: {parent_heads.size:,}")
print(f"  Active parent cells: {np.sum(parent_active_mask):,}")
print(f"  Valid heads for interpolation: {len(parent_heads_valid):,}")
print(f"  Head range: {parent_heads_valid.min():.1f} to {parent_heads_valid.max():.1f} m")

# Create KDTree for efficient nearest neighbor search
from scipy.spatial import cKDTree
parent_tree = cKDTree(parent_coords_valid)

print("✓ Parent model heads prepared for boundary interpolation")

In [None]:
# Create complete CHD boundary
def create_complete_boundary_ibound(submodel_grid, boundary_polygon, boundary_thickness=1):
    """
    Create IBOUND array with complete CHD boundary around submodel domain.
    
    Parameters:
    -----------
    submodel_grid : flopy.discretization.StructuredGrid
        Submodel grid
    boundary_polygon : shapely.geometry.Polygon
        Clipped boundary polygon
    boundary_thickness : int
        Number of cell layers to mark as CHD from domain edge
    
    Returns:
    --------
    ibound : numpy.ndarray
        IBOUND array with CHD boundaries
    boundary_cells : list
        List of boundary cell information
    """
    from shapely.geometry import Point
    
    # Initialize IBOUND as all active cells
    ibound = np.ones((nlay, submodel_grid.nrow, submodel_grid.ncol), dtype=int)
    boundary_cells = []
    
    # Method 1: Mark cells outside or on boundary of clipped polygon as CHD
    for i in range(submodel_grid.nrow):
        for j in range(submodel_grid.ncol):
            # Get cell center coordinates
            x_center = submodel_grid.xcellcenters[i, j]
            y_center = submodel_grid.ycellcenters[i, j]
            cell_point = Point(x_center, y_center)
            
            # Check if cell center is outside boundary or very close to boundary
            distance_to_boundary = cell_point.distance(boundary_polygon.boundary)
            
            # Mark as CHD if:
            # 1. Cell is outside the boundary polygon, OR
            # 2. Cell is very close to boundary (within half cell size)
            if (not boundary_polygon.contains(cell_point) or 
                distance_to_boundary < sub_cell_size * 0.5):
                
                # Only mark as CHD if cell has valid coordinates (not completely outside model domain)
                if (submodel_grid.extent[0] <= x_center <= submodel_grid.extent[1] and
                    submodel_grid.extent[2] <= y_center <= submodel_grid.extent[3]):
                    
                    ibound[0, i, j] = -1  # CHD cell
                    boundary_cells.append({
                        'submodel_row': i,
                        'submodel_col': j,
                        'x': x_center,
                        'y': y_center,
                        'distance_to_boundary': distance_to_boundary
                    })
    
    # Method 2: Ensure edge cells are CHD (safety measure)
    edge_thickness = max(1, boundary_thickness)
    
    # Top and bottom edges
    ibound[:, :edge_thickness, :] = -1
    ibound[:, -edge_thickness:, :] = -1
    
    # Left and right edges  
    ibound[:, :, :edge_thickness] = -1
    ibound[:, :, -edge_thickness:] = -1
    
    # Add edge cells to boundary_cells list if not already included.
    # A set of (row, col) tuples avoids scanning the whole list for every cell.
    existing_cells = {
        (cell['submodel_row'], cell['submodel_col']) for cell in boundary_cells
    }
    for i in range(submodel_grid.nrow):
        for j in range(submodel_grid.ncol):
            if ibound[0, i, j] == -1 and (i, j) not in existing_cells:
                boundary_cells.append({
                    'submodel_row': i,
                    'submodel_col': j,
                    'x': submodel_grid.xcellcenters[i, j],
                    'y': submodel_grid.ycellcenters[i, j],
                    'distance_to_boundary': 0.0  # Edge cell
                })
    
    return ibound, boundary_cells

# Detect the complete boundary and create IBOUND
print("Creating complete CHD boundary around submodel domain...")
submodel_ibound_complete, boundary_cells_complete = create_complete_boundary_ibound(
    submodel_grid, clipped_submodel_boundary, boundary_thickness=1
)

print(f"Complete boundary detection results:")
print(f"  Total boundary cells: {len(boundary_cells_complete)}")
print(f"  CHD cells: {np.sum(submodel_ibound_complete == -1):,}")
print(f"  Active cells: {np.sum(submodel_ibound_complete == 1):,}")

# Head interpolation for all boundary cells
print("Interpolating heads for all boundary cells...")

# Use the same KDTree approach but for all boundary cells
sub_boundary_coords_complete = []
for cell in boundary_cells_complete:
    sub_boundary_coords_complete.append([cell['x'], cell['y']])

sub_boundary_coords_complete = np.array(sub_boundary_coords_complete)

# Perform inverse distance weighted interpolation
k_neighbors = min(5, len(parent_heads_valid))
distances, neighbor_indices = parent_tree.query(sub_boundary_coords_complete, k=k_neighbors)

# Handle zero distances
distances = np.maximum(distances, 1e-10)

# Calculate weights
weights = 1.0 / distances
weights = weights / weights.sum(axis=1, keepdims=True)

# Weighted average
interpolated_heads_complete = np.sum(
    parent_heads_valid[neighbor_indices] * weights, axis=1
)

# Create CHD package data for all boundary cells
chd_data_complete = []
for i, cell in enumerate(boundary_cells_complete):
    interpolated_head = float(interpolated_heads_complete[i])
    
    chd_data_complete.append([
        0,  # Layer 0
        cell['submodel_row'],
        cell['submodel_col'],
        interpolated_head,
        interpolated_head
    ])

print(f"Complete CHD data created: {len(chd_data_complete)} cells")

# Update variables for consistency with rest of notebook
submodel_ibound_clipped = submodel_ibound_complete
boundary_cells = boundary_cells_complete
submodel_chd_data = chd_data_complete

# Visualize the complete boundary conditions
fig, ax = plt.subplots(figsize=(12, 10))

# Define colormap for IBOUND visualization  
import matplotlib.colors as mcolors
cmap = mcolors.ListedColormap(['blue', 'white'])  # CHD=-1: blue, Active=1: white
bounds = [-1.5, -0.5, 1.5]
norm = mcolors.BoundaryNorm(bounds, cmap.N)

pmv = flopy.plot.PlotMapView(modelgrid=submodel_grid, ax=ax)
im = pmv.plot_array(submodel_ibound_complete[0], cmap=cmap, norm=norm)
pmv.plot_grid(color='gray', alpha=0.3, linewidth=0.3)

# Plot original clipped boundary for reference
if hasattr(clipped_submodel_boundary, 'exterior'):
    boundary_x, boundary_y = clipped_submodel_boundary.exterior.xy
    ax.plot(boundary_x, boundary_y, 'red', linewidth=2, label='Original Boundary')

# Plot wells
wells_gdf.plot(ax=ax, color='red', markersize=80, label='Wells', zorder=5,
               edgecolors='white', linewidth=1)

# Colorbar
cbar = plt.colorbar(im, ax=ax, shrink=0.3, ticks=[-1, 1])
cbar.ax.set_yticklabels(["CHD (-1)", "Active (1)"])
cbar.set_label("IBOUND")

ax.set_title('Complete Submodel Boundary Conditions\n(Blue: CHD boundary, White: Active cells)')
ax.legend()
ax.set_aspect('equal')
plt.tight_layout()
plt.show()

print("✓ Complete CHD boundary created around entire submodel domain")

In [None]:
sub_base_chd = flopy.modflow.ModflowChd(
    model=m_sub_base,
    stress_period_data={0: submodel_chd_data},
    ipakcb=53,
    model_ws=sub_base_ws
)

# Extract CHD package data and visualize with continuous colormap
fig, ax = plt.subplots(figsize=(12, 10))

pmv = flopy.plot.PlotMapView(model=m_sub_base, ax=ax)
pmv.plot_grid(color='gray', alpha=0.3, linewidth=0.3)

# Extract CHD data
chd_package = m_sub_base.chd
chd_data = chd_package.stress_period_data[0]  # First stress period

# Create arrays to hold CHD head values for plotting
chd_array = np.full((m_sub_base.nrow, m_sub_base.ncol), np.nan)
chd_coords_x = []
chd_coords_y = []
chd_heads = []

for chd_cell in chd_data:
    layer, row, col, start_head, end_head = chd_cell
    chd_array[row, col] = start_head
    
    # Also collect coordinates for scatter plot option
    x = m_sub_base.modelgrid.xcellcenters[row, col]
    y = m_sub_base.modelgrid.ycellcenters[row, col]
    chd_coords_x.append(x)
    chd_coords_y.append(y)
    chd_heads.append(start_head)

# Method 1: Plot CHD as array (shows cells as squares)
chd_masked = np.ma.masked_where(np.isnan(chd_array), chd_array)
im = pmv.plot_array(chd_masked, alpha=0.8, cmap='viridis')

# Method 2: Alternative - plot as scatter points (optional, comment out if using array method)
# scatter = ax.scatter(chd_coords_x, chd_coords_y, 
#                     c=chd_heads, 
#                     s=30,  # Size of points
#                     cmap='viridis',
#                     edgecolors='white',
#                     linewidth=0.5,
#                     alpha=0.8)

# Add colorbar
cbar = plt.colorbar(im, ax=ax, shrink=0.7)
cbar.set_label('CHD Head (m a.s.l.)', fontsize=12)

# Add contours of CHD heads for better visualization
if len(chd_heads) > 3:  # Need at least a few points for contouring
    from scipy.interpolate import griddata
    
    # Create a regular grid for interpolation
    xi = np.linspace(ax.get_xlim()[0], ax.get_xlim()[1], 50)
    yi = np.linspace(ax.get_ylim()[0], ax.get_ylim()[1], 50)
    xi_grid, yi_grid = np.meshgrid(xi, yi)
    
    # Interpolate CHD values
    zi_grid = griddata((chd_coords_x, chd_coords_y), chd_heads, 
                       (xi_grid, yi_grid), method='linear')
    
    # Add contour lines
    contours = ax.contour(xi_grid, yi_grid, zi_grid, 
                         levels=8, colors='white', linewidths=1, alpha=0.8)
    ax.clabel(contours, inline=True, fontsize=9, fmt='%.1f m')

ax.set_title('CHD Package - Specified Heads with Continuous Colormap')
ax.set_xlabel('X Coordinate (m)')
ax.set_ylabel('Y Coordinate (m)')
ax.set_aspect('equal')
plt.tight_layout()
plt.show()

# Print CHD statistics
print(f"CHD Package Summary:")
print(f"  Number of CHD cells: {len(chd_data)}")
print(f"  Head range: {min(chd_heads):.2f} to {max(chd_heads):.2f} m")
print(f"  Mean head: {np.mean(chd_heads):.2f} m")
print(f"  Standard deviation: {np.std(chd_heads):.2f} m")

sub_base_bas = flopy.modflow.ModflowBas(m_sub_base, ibound=submodel_ibound_clipped, strt=gw_elevations)

# Plot IBOUND and starting heads
fig, ax = plt.subplots(1, 1, figsize=(16, 8))
pmv = flopy.plot.PlotMapView(model=m_sub_base, ax=ax)
im = pmv.plot_ibound()
plt.colorbar(im, ax=ax, shrink=0.7, ticks=[-1, 0, 1])
pmv.plot_grid(color='gray', alpha=0.3, linewidth=0.3)
ax.set_title('IBOUND Array')
plt.xlabel("X-coordinate")
plt.ylabel("Y-coordinate")
ax.set_aspect('equal')
plt.show()

#### RECH package

In [None]:
# Extract recharge from parent model
parent_rch = m_parent_base.rch
if parent_rch is not None:
    parent_recharge = parent_rch.rech.array
    
    # Get representative recharge value from parent model
    # Check if recharge is 2D or 3D array
    if parent_recharge.ndim == 3:
        # 3D array: [stress_period, row, col]
        parent_rech_values = parent_recharge[0]  # First stress period
    else:
        # 2D array: [row, col]
        parent_rech_values = parent_recharge
    
    # Get active cells in parent model for statistics
    # Make sure we match the dimensions correctly
    if parent_ibound.ndim == 3:
        parent_active_mask = parent_ibound[0] == 1  # Use first layer
        print(f"Parent IBOUND shape: {parent_ibound.shape}, using layer 0")
    else:
        parent_active_mask = parent_ibound == 1
        print(f"Parent IBOUND shape: {parent_ibound.shape}")
    
    print(f"Parent recharge shape: {parent_rech_values.shape}")
    print(f"Parent active mask shape: {parent_active_mask.shape}")
    
    # Ensure dimensions match
    if parent_rech_values.shape != parent_active_mask.shape:
        print(f"Warning: Dimension mismatch between recharge {parent_rech_values.shape} and active mask {parent_active_mask.shape}")
        # If recharge is 1D (single value), expand it to match grid
        if parent_rech_values.ndim == 0 or (parent_rech_values.ndim == 1 and len(parent_rech_values) == 1):
            uniform_recharge = float(parent_rech_values) if parent_rech_values.ndim == 0 else parent_rech_values[0]
            print(f"Using scalar recharge value: {uniform_recharge:.6f} m/day")
        else:
            # Use fallback value
            uniform_recharge = 0.110 / 365.25  # 110 mm/year converted to m/day
            print(f"Dimension mismatch, using default recharge: {uniform_recharge:.6f} m/day")
    else:
        # Calculate representative recharge from active cells
        active_recharge_values = parent_rech_values[parent_active_mask]
        valid_recharge = active_recharge_values[active_recharge_values > 0]
        
        if len(valid_recharge) > 0:
            uniform_recharge = np.median(valid_recharge)
            print(f"Extracted recharge from parent model:")
            print(f"  Active cells with recharge: {len(valid_recharge):,}")
            print(f"  Recharge range: {valid_recharge.min():.6f} to {valid_recharge.max():.6f} m/day")
            print(f"  Median recharge: {uniform_recharge:.6f} m/day ({uniform_recharge*365.25*1000:.1f} mm/year)")
        else:
            # Fallback to typical values for Swiss conditions
            uniform_recharge = 0.110 / 365.25  # 110 mm/year converted to m/day
            print(f"No valid recharge found in parent model, using default: {uniform_recharge:.6f} m/day")
        
else:
    # No recharge package in parent model - use default
    uniform_recharge = 0.110 / 365.25  # m/day (110 mm/year)
    print(f"No RCH package in parent model, using default recharge: {uniform_recharge:.6f} m/day")

# Create uniform recharge array for submodel
sub_recharge_array = np.full((sub_modelgrid.nrow, sub_modelgrid.ncol), uniform_recharge)

# Create RCH package for submodel
sub_base_rch = flopy.modflow.ModflowRch(
    m_sub_base,
    rech=sub_recharge_array,
    nrchop=3  # Apply recharge to highest active cell
)

print(f"\nSubmodel RCH package created:")
print(f"  Uniform recharge rate: {uniform_recharge:.6f} m/day ({uniform_recharge*365.25*1000:.1f} mm/year)")
print(f"  Applied to grid: {sub_modelgrid.nrow} × {sub_modelgrid.ncol} cells")

# Visualize recharge distribution
fig, ax = plt.subplots(figsize=(10, 8))

pmv = flopy.plot.PlotMapView(modelgrid=submodel_grid, ax=ax)
im = pmv.plot_array(sub_recharge_array * 365.25 * 1000, cmap='Blues', alpha=0.7)  # Convert to mm/year for display
pmv.plot_grid(color='gray', alpha=0.3, linewidth=0.3)

# Plot wells for reference
wells_gdf.plot(ax=ax, color='red', markersize=60, label='Wells', zorder=5,
               edgecolors='white', linewidth=1)

cbar = plt.colorbar(im, ax=ax, shrink=0.3)
cbar.set_label('Recharge Rate (mm/year)')

ax.set_title(f'Uniform Recharge on Submodel Grid\n{uniform_recharge*365.25*1000:.1f} mm/year')
ax.set_xlabel('X Coordinate (m)')
ax.set_ylabel('Y Coordinate (m)')
ax.legend()
ax.set_aspect('equal')
plt.tight_layout()
plt.show()

print("✓ RCH package created with uniform recharge from parent model")

#### WEL package
Here, we reuse the well locations and rates loaded from the flow case study.

TODO: Update well rates in the cell below based on actual concessioned rates

In [None]:
# Get the well rates from the Zurich GIS browser (reuse the rates from your
# group's flow case study)

# TODO: Update well rates based on actual concessioned rates
well_rates_m3d = 1400 / 1000 * 86400  # 1400 L/s converted to m³/day
print(f"Total concessioned rate: {round(well_rates_m3d):,} m³/day")

# Map wells to submodel grid cells
from flopy.utils.gridintersect import GridIntersect
from scipy.spatial import cKDTree

# Create GridIntersect object for the submodel
gi = GridIntersect(submodel_grid, method='vertex', rtree=True)

# Also prepare KDTree for fallback nearest-cell lookup
xc_sub = submodel_grid.xcellcenters
yc_sub = submodel_grid.ycellcenters
centers_flat = np.column_stack([xc_sub.ravel(), yc_sub.ravel()])
kdtree = cKDTree(centers_flat)

well_cells = []
for idx, well in wells_gdf.iterrows():
    well_x, well_y = well.geometry.x, well.geometry.y
    
    # Try GridIntersect first
    try:
        result = gi.intersect(Point(well_x, well_y))
        if len(result) > 0:
            # Extract row, col from intersection result
            if hasattr(result, 'iloc'):  # DataFrame
                row = int(result.iloc[0]['row'])  
                col = int(result.iloc[0]['col'])
            else:  # Other formats
                row = int(result[0]['row'])
                col = int(result[0]['col'])
        else:
            raise ValueError("No intersection found")
    except Exception:
        # Fallback to nearest cell center (also used when the result has no 'row'/'col' fields)
        dist, idx_flat = kdtree.query([well_x, well_y])
        row, col = np.unravel_index(idx_flat, xc_sub.shape)
    
    # Check if cell is active (IBOUND == 1); constant-head (-1) and inactive (0) cells are skipped
    if submodel_ibound_clipped[0, row, col] == 1:  # Active cell
        well_cells.append({
            'well_idx': idx,
            'well_id': well.get('GWR_ID', f'well_{idx}'),
            'x': well_x,
            'y': well_y,
            'layer': 0,
            'row': row, 
            'col': col,
            'fassart': well.get('FASSART', 'Unknown')
        })
        print(f"Well {well.get('GWR_ID', idx)}: mapped to cell (L0, R{row}, C{col})")
    else:
        print(f"Warning: Well {well.get('GWR_ID', idx)} mapped to non-active cell (L0, R{row}, C{col}; CHD or inactive) - skipping")

print(f"\nMapped {len(well_cells)} wells to active submodel cells")

# Count wells per FASSART type
from collections import Counter
fassart_counts = Counter(w['fassart'] for w in well_cells)
print(f"\nWells per FASSART type:")
for fassart, count in fassart_counts.items():
    print(f"  {fassart}: {count} wells")

# Define pumping rates based on scenario
# Divide the total concessioned rate by the number of wells for each FASSART
pumping_rates = {}
for fassart, count in fassart_counts.items():
    if 'Entnahme' in fassart:
        # Pumping wells - negative rate, divided by number of wells
        pumping_rates[fassart] = -well_rates_m3d / count
    elif 'Rückgabe' in fassart or 'Sickergalerie' in fassart:
        # Injection/infiltration wells - positive rate, divided by number of wells
        pumping_rates[fassart] = +well_rates_m3d / count
    elif 'Sickergalierie' in fassart:
        # Common misspelling handling
        pumping_rates[fassart] = +well_rates_m3d / count
    else:
        # Default for unknown types
        pumping_rates[fassart] = -100 / count

print(f"\nPumping rates per well (divided by number of wells per FASSART):")
for fassart, rate in pumping_rates.items():
    print(f"  {fassart}: {rate:.1f} m³/day per well")

# Create WEL stress period data
wel_data = []
for well_info in well_cells:
    fassart = well_info['fassart']
    
    # Get the rate for this FASSART type (already divided by number of wells)
    rate = pumping_rates.get(fassart, -100)
    
    wel_data.append([well_info['layer'], well_info['row'], well_info['col'], rate])
    print(f"  {well_info['well_id']} ({fassart}): {rate:.1f} m³/day at (L{well_info['layer']}, R{well_info['row']}, C{well_info['col']})")

print(f"\nWEL package data created with {len(wel_data)} wells")
total_pumping = sum(rate for _, _, _, rate in wel_data if rate < 0)
total_injection = sum(rate for _, _, _, rate in wel_data if rate > 0)
print(f"  Total pumping: {total_pumping:,.0f} m³/day")
print(f"  Total injection: {total_injection:,.0f} m³/day")
print(f"  Net extraction: {total_pumping + total_injection:,.0f} m³/day")

#### Solver & output control

In [None]:
# NWT Solver
nwt = flopy.modflow.ModflowNwt(
    m_sub_base,
    headtol=0.01,      # Head tolerance
    fluxtol=5.0,       # Flux tolerance  
    maxiterout=100,    # Maximum outer iterations
    thickfact=1e-05,   # Thickness factor for dry cells
    linmeth=1,         # Linear solution method (1=GMRES, 2=XMD)
    iprnwt=1,          # Print flag
    ibotav=0,          # Bottom averaging flag
    options='COMPLEX'  # Use complex option for difficult problems
)

# Output Control
oc = flopy.modflow.ModflowOc(
    m_sub_base,
    stress_period_data={(0, 0): ['save head', 'save budget']}
)

### Run Submodel Flow Simulation

Run steady-state flow on the refined submodel.

In [None]:
# Write input files and run the submodel
print("Writing submodel input files...")
m_sub_base.write_input()

# Check model setup
print("\nChecking submodel setup...")
chk = m_sub_base.check(f=None, verbose=False)
if chk.summary_array is not None and len(chk.summary_array) > 0:
    print("Model check warnings found - reviewing...")
    for warning in chk.summary_array:
        print(f"  Warning: {warning}")
else:
    print("Model check passed")

# Run the submodel
print(f"\nRunning refined submodel...")
success, buff = m_sub_base.run_model(silent=False, report=True)

In [None]:
if success: 
    plot_utils.plot_model_results(m_sub_base, sub_base_ws, parent_base_namefile.replace('.nam', ''),
                                  show_wells=False, show_ibound=True)


### Verify Submodel Flow Results

Check that submodel flow field is reasonable and consistent with parent model.
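
The first check in the list below, the mass balance error, can be sketched as follows. The function name and the budget totals are illustrative placeholders; the formula is the percent discrepancy that MODFLOW reports in its listing file, 100·(IN − OUT) / ((IN + OUT)/2).

```python
def percent_discrepancy(total_in, total_out):
    """Percent discrepancy between total inflow and total outflow (same units)."""
    mean_flux = 0.5 * (total_in + total_out)
    if mean_flux == 0.0:
        return 0.0  # no flow at all, nothing to compare
    return 100.0 * (total_in - total_out) / mean_flux

# Hypothetical budget totals in m³/day (replace with values from your listing file)
err = percent_discrepancy(total_in=125_400.0, total_out=125_150.0)
print(f"Flow mass balance error: {err:.3f} %")
assert abs(err) < 1.0, "Mass balance error exceeds 1 %"
```

The same helper can be reused for the transport mass balance in Section 12, since the definition does not depend on what is being balanced.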

In [None]:
# TODO: Quality checks
# - Mass balance error < 1%
# - Head contours match parent in interior
# - Flow directions reasonable
# - Wells create expected drawdown
# - No dry cells or convergence issues

# Visualize:
# - Head contour map
# - Drawdown from wells
# - Flow vectors

---
## 8. Set Up MT3DMS Transport Model

### Create MT3DMS Model Object

Link MT3DMS to the MODFLOW submodel for coupled transport simulation.

In [None]:
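# TODO: Create MT3DMS model
# MT3DMS reads flows from a link file: add the LMT package to the flow model, then rewrite and rerun it
# lmt = flopy.modflow.ModflowLmt(m_sub_base, output_file_name='mt3d_link.ftl')
# mt = flopy.mt3d.Mt3dms(modelname='transport', modflowmodel=m_sub_base, model_ws=sub_base_ws)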

### Basic Transport Package (BTN)

Set up the BTN package with porosity, initial conditions, and time stepping.

**Key parameters:**
- `prsity`: Effective porosity (0.25 for sand/gravel)
- `sconc`: Initial concentration (0 everywhere)
- `icbund`: Active transport cells (1=active, -1=constant, 0=inactive)
- `nprs`: Number of times to save concentration output
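
The time-related BTN inputs for a 30-day source and a 2-year horizon could be assembled as below. This is a sketch under assumptions: two transport stress periods (source on, source off), output at the times suggested later in this notebook, and keyword names as used by `flopy.mt3d.Mt3dBtn`. The step counts and the idea of unpacking a dictionary into the constructor are illustrative choices, not part of the course template.

```python
# Assumed scenario timing in days: 30-day source, 730-day total simulation
t_pulse = 30.0
t_end = 730.0

perlen = [t_pulse, t_end - t_pulse]                # length of each transport stress period
output_times = [30.0, 90.0, 180.0, 365.0, 730.0]   # times at which concentrations are saved

btn_kwargs = {
    "nper": len(perlen),
    "perlen": perlen,
    "nstp": [1, 1],      # one flow time step per period (steady-state flow assumed)
    "prsity": 0.25,      # effective porosity from the parameter list above
    "sconc": 0.0,        # clean aquifer initially
    "icbund": 1,         # all cells active for transport (refine with IBOUND if needed)
    "nprs": len(output_times),
    "timprs": output_times,
}

# Possible use once `mt` exists: btn = flopy.mt3d.Mt3dBtn(mt, **btn_kwargs)
print(btn_kwargs)
```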

In [None]:
# TODO: Create BTN package
# Define transport time steps and output times
# Set porosity from config
# Initialize concentration to zero

### Advection Package (ADV)

Configure the advection scheme. Recommended: TVD (Total Variation Diminishing) for accuracy and stability.

**Options for mixelm:**
- `mixelm=0`: Standard finite difference (upstream or central weighting) - fast, but prone to numerical dispersion
- `mixelm=1`: MOC (Method of Characteristics) - accurate but slow
- `mixelm=2`: MMOC (Modified MOC) - faster, less accurate
- `mixelm=3`: HMOC (Hybrid MOC) - balance
- `mixelm=-1`: TVD (ULTIMATE scheme) - **recommended** (accurate, stable, fast)

In [None]:
# TODO: Create ADV package
# Use TVD scheme (mixelm=-1)
# adv = flopy.mt3d.Mt3dAdv(mt, mixelm=-1)

### Dispersion Package (DSP)

Set dispersivity values from configuration.

**Typical values:**
- Longitudinal dispersivity (αL): 10 m (scale-dependent, ~10% of travel distance)
- Transverse dispersivity (αT): 1 m (typically αL/10)
- Vertical dispersivity (αV): 0.1 m (typically αL/100)

**Molecular diffusion** is usually negligible compared to mechanical dispersion.
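
The rules of thumb above translate into DSP inputs roughly as follows. The helper name, the 100 m travel distance, the 1 m/day velocity and the diffusion coefficient of about 1e-9 m²/s are assumptions for illustration; take the actual values from your configuration file.

```python
def dispersivity_from_scale(travel_distance_m, frac_long=0.1, ratio_trans=0.1, ratio_vert=0.01):
    """Estimate dispersivities from travel distance using the rules of thumb above."""
    al = frac_long * travel_distance_m
    return {
        "al": al,                       # longitudinal dispersivity (m)
        "trpt": ratio_trans,            # alpha_T / alpha_L
        "trpv": ratio_vert,             # alpha_V / alpha_L
        "alpha_T": al * ratio_trans,    # transverse dispersivity (m)
        "alpha_V": al * ratio_vert,     # vertical dispersivity (m)
    }

params = dispersivity_from_scale(travel_distance_m=100.0)
print(params)

# Compare mechanical dispersion with molecular diffusion (assumed values)
v = 1.0                           # pore water velocity, m/day
d_mech = params["al"] * v         # m²/day
d_mol = 1e-9 * 86400              # about 1e-9 m²/s expressed in m²/day
print(f"D_mech / D_mol = {d_mech / d_mol:.0f}")
```

With these numbers mechanical dispersion exceeds molecular diffusion by about five orders of magnitude, which is why `dmcoef` is often left at zero for advection-dominated problems.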

In [None]:
# TODO: Create DSP package
# Load dispersivity values from config
# al = longitudinal dispersivity
# trpt = ratio of transverse to longitudinal (αT/αL)
# trpv = ratio of vertical to longitudinal (αV/αL)
# dsp = flopy.mt3d.Mt3dDsp(mt, al=10.0, trpt=0.1, trpv=0.01)

### Reaction Package (RCT) - If Applicable

For Group 0, TCE is treated as a **conservative tracer** (sorption and decay are neglected), so RCT is not needed.

**For other groups with reactions:**
- **Sorption**: `isothm=1` (linear), `sp1=Kd` (distribution coefficient)
- **Decay**: `ireact=1` (first-order reaction), `rc1=λ` (first-order decay constant, 1/day)

**Note**: Groups 3, 4, 7, 8 will need this package.
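
For groups that do need RCT, the two derived quantities are the retardation factor R = 1 + ρb·Kd/n and the decay constant λ = ln 2 / t½. The sketch below uses hypothetical parameter values (bulk density, Kd, half-life); only the formulas are standard. Keep the units of ρb and Kd consistent so that their product is dimensionless.

```python
import math

def retardation_factor(bulk_density_kg_per_l, kd_l_per_kg, porosity):
    """Linear-sorption retardation factor R = 1 + rho_b * Kd / n."""
    return 1.0 + bulk_density_kg_per_l * kd_l_per_kg / porosity

def decay_constant_per_day(half_life_days):
    """First-order decay constant from a half-life."""
    return math.log(2.0) / half_life_days

# Hypothetical values for illustration only
R = retardation_factor(bulk_density_kg_per_l=1.7, kd_l_per_kg=0.1, porosity=0.25)
lam = decay_constant_per_day(half_life_days=365.0)
print(f"R = {R:.2f} (plume travels {1 / R:.0%} as fast as groundwater)")
print(f"lambda = {lam:.5f} 1/day")
```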

In [None]:
# For Group 0: No RCT package needed (conservative tracer)

# For groups with reactions (example):
# rct = flopy.mt3d.Mt3dRct(mt, isothm=1, sp1=Kd, ireact=1, rc1=lambda_decay)  # ireact=1 activates first-order decay

### GCG Solver Package

Configure the Generalized Conjugate Gradient solver for MT3DMS.

In [None]:
# TODO: Create GCG package
# gcg = flopy.mt3d.Mt3dGcg(mt, mxiter=100, iter1=50, cclose=1e-6)

---
## 9. Define Source Term (SSM Package)

### Map Source Location to Grid

Convert the source coordinates (Swiss LV03/LV95) to submodel grid indices (layer, row, column).
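
For a uniform, unrotated grid the conversion is simple arithmetic, sketched below. The function name, the grid origin, the cell size and the source coordinates are hypothetical. For the actual submodel, FloPy's structured grid offers `modelgrid.intersect(x, y)`, which also accounts for grid offset and rotation and is probably the safer choice.

```python
def xy_to_row_col(x, y, xmin, ymax, delr, delc, nrow, ncol):
    """Row and column of the cell containing (x, y).

    Assumes an unrotated grid with uniform spacing whose upper-left
    corner is at (xmin, ymax); row 0 is the northernmost row.
    """
    col = int((x - xmin) // delr)
    row = int((ymax - y) // delc)
    if not (0 <= row < nrow and 0 <= col < ncol):
        raise ValueError("Point lies outside the grid")
    return row, col

# Hypothetical LV95 coordinates and grid geometry
row, col = xy_to_row_col(
    x=2_679_512.0, y=1_249_487.0,
    xmin=2_679_000.0, ymax=1_250_000.0,
    delr=5.0, delc=5.0, nrow=200, ncol=300,
)
print(f"Source cell: layer 0, row {row}, col {col}")
```

After mapping, check that the cell has `ibound == 1`; a source placed in an inactive or constant-head cell will not behave as intended.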

In [None]:
# TODO: Load source location from config
# source_easting, source_northing from case_config_transport.yaml
# Convert to grid indices (layer, row, col)
# Verify source is within active model domain

### Visualize Source Location

Plot source location on submodel grid to verify correct placement.

In [None]:
# TODO: Create map showing:
# - Submodel grid
# - Source cell (red star)
# - Wells
# - Head contours
# - Flow direction

### Set Up SSM Package

Define the source term using the Source-Sink Mixing (SSM) package.

**For Group 0 (30-day pulse):**
- Use `itype=-1` (constant concentration cell)
- Active for first 30 days (stress period 0)
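- Zero concentration after 30 days (stress period 1+). Note that MT3DMS keeps a cell defined with `itype=-1` as a constant-concentration cell for the rest of the simulation, so this pins the source cell at C = 0 and removes mass arriving there; check the effect in the mass balance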

**SSM data format:**
```python
# One record per source cell; five values for a single-species model
ssm_data = {
    0: [(lay, row, col, concentration, itype)],  # stress period 0
    1: [(lay, row, col, 0.0, itype)]             # stress period 1
}
```
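
A small helper can build this dictionary for the pulse source. The sketch assumes a single species, so each record carries five values (layer, row, column, concentration, itype); the function name, cell indices and the 100 mg/L source concentration are placeholders.

```python
ITYPE_CONSTANT_CONC = -1  # MT3DMS code for a constant-concentration cell

def build_pulse_ssm(lay, row, col, c_source):
    """SSM records for a source that is on in period 0 and set to zero in period 1."""
    return {
        0: [(lay, row, col, c_source, ITYPE_CONSTANT_CONC)],  # 0-30 days: source active
        1: [(lay, row, col, 0.0, ITYPE_CONSTANT_CONC)],       # 30-730 days: source off
    }

# Hypothetical source cell and concentration (mg/L)
ssm_data = build_pulse_ssm(lay=0, row=102, col=102, c_source=100.0)
for kper, records in ssm_data.items():
    print(kper, records)
```

The dictionary keys must match the transport stress periods defined in BTN, here period 0 (30 days) and period 1 (700 days).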

In [None]:
# TODO: Create SSM package
# Define stress periods (0-30 days: source active, 30-730 days: source off)
# Set up SSM data dictionary
# ssm = flopy.mt3d.Mt3dSsm(mt, stress_period_data=ssm_data)

---
## 10. Run Transport Simulation

### Write MT3DMS Input Files

Write all MT3DMS package files to disk.

In [None]:
# TODO: Write MT3DMS input files
# mt.write_input()

### Run MT3DMS

Execute the transport simulation. This may take several minutes depending on grid size and time steps.

**Monitor for:**
- Convergence at each time step
- Mass balance errors
- Warnings about negative concentrations (instability)
- Courant/Peclet number violations

In [None]:
# TODO: Run MT3DMS
# success, buff = mt.run_model(silent=False)
# Check for successful completion

### Load Concentration Results

Read the UCN (concentration) file to extract results at different times.
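
Saved output times do not always match the requested times exactly, so it can help to pick the nearest available time before calling `ucn.get_data(totim=...)`. The helper below is a sketch; the list of available times is made up and would normally come from `ucn.get_times()`.

```python
def nearest_times(available, targets):
    """Map each target time to the closest available output time."""
    return {t: min(available, key=lambda a: abs(a - t)) for t in targets}

# Hypothetical output times in days, e.g. from ucn.get_times()
available = [30.0, 90.0, 180.0, 365.0, 729.99]
selected = nearest_times(available, targets=[30, 90, 180, 365, 730])

for target, actual in selected.items():
    print(f"requested {target:>3} d -> using {actual} d")
```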

In [None]:
# TODO: Load concentration results
# ucn = flopy.utils.UcnFile('MT3D001.UCN')
# Get times available
# Load concentrations at key times (30d, 90d, 180d, 365d, 730d)

---
## 11. Post-Processing and Visualization

### Concentration Maps at Multiple Times

Create concentration contour maps showing plume evolution over time.

**Suggested times:** 30 days, 90 days, 6 months, 1 year, 2 years

In [None]:
# TODO: Create concentration maps
# For each time:
#   - Plot concentration contours
#   - Overlay wells, source, river
#   - Add scale bar, north arrow
#   - Label max concentration
# Create multi-panel figure showing evolution

### Breakthrough Curves at Monitoring Points

Plot concentration vs. time at key observation points.
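
The breakthrough time can be estimated from a concentration time series by linear interpolation between the two samples that bracket the threshold. The function name and the sample series are illustrative; in the notebook, the series would come from something like `ucn.get_ts((layer, row, col))`.

```python
def breakthrough_time(times, conc, threshold):
    """First time the concentration exceeds the threshold (linear interpolation).

    Returns None if the threshold is never exceeded.
    """
    for i in range(len(times)):
        if conc[i] > threshold:
            if i == 0:
                return times[0]
            frac = (threshold - conc[i - 1]) / (conc[i] - conc[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

# Hypothetical series at a monitoring point (days, mg/L)
times = [0, 30, 90, 180, 365, 730]
conc = [0.0, 0.0, 2.0, 8.0, 4.0, 1.0]

t_bt = breakthrough_time(times, conc, threshold=5.0)
print(f"Breakthrough above 5 mg/L at about day {t_bt:.0f}")
```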

In [None]:
# TODO: Extract concentration time series at monitoring points
# Plot C(t) for:
#   - Downgradient monitoring well
#   - Pumping well location
#   - River compliance point
# Add compliance threshold line
# Calculate breakthrough time (when C > threshold)

### Plume Extent Over Time

Calculate the area where concentration exceeds a threshold (e.g., 1 mg/L or 5 mg/L).
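
Counting cells above the threshold can be sketched as below for one layer at one output time. The small array and the 25 m² cell area are made up; the upper bound excludes the large flag value (1e30 by default in FloPy's BTN package) that MT3DMS writes for inactive cells.

```python
def plume_area(conc_2d, threshold, cell_area, inactive_value=1e30):
    """Area of cells whose concentration exceeds the threshold.

    Cells carrying the inactive-cell flag value are ignored.
    """
    n_cells = sum(
        1 for row in conc_2d for c in row if threshold < c < inactive_value
    )
    return n_cells * cell_area

# Hypothetical 3 x 3 concentration field in mg/L (one inactive cell)
conc = [
    [0.0, 0.5, 0.2],
    [1.2, 6.0, 0.8],
    [1e30, 7.5, 5.1],
]
area = plume_area(conc, threshold=5.0, cell_area=25.0)
print(f"Plume area above 5 mg/L: {area:.0f} m²")
```

Because the function only iterates over rows and values, it should also accept a 2D NumPy slice such as `ucn.get_data(totim=t)[0]`.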

In [None]:
# TODO: For each time step:
# - Count cells where C > threshold
# - Calculate total area (n_cells × cell_area)
# - Plot extent vs time
# - Identify time of maximum extent

### Mass Balance Analysis

Track the fate of contaminant mass:
- Mass remaining in aquifer
- Mass exported through boundaries
- Mass captured by pumping wells
- Mass lost to decay (if applicable)
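
The first component, mass remaining in the aquifer, is the sum of C × saturated cell volume × porosity over all active cells. The sketch below works on flat lists and converts mg/L and m³ to kilograms; the cell volumes (5 m × 5 m × 10 m saturated thickness) and concentrations are assumed values.

```python
def dissolved_mass_kg(conc_mg_per_l, cell_volume_m3, porosity):
    """Dissolved mass in kg from per-cell concentration and saturated volume."""
    total_mg = 0.0
    for c, vol in zip(conc_mg_per_l, cell_volume_m3):
        # water volume in litres = bulk volume (m³) * porosity * 1000 L/m³
        total_mg += c * vol * porosity * 1000.0
    return total_mg / 1e6  # mg -> kg

# Hypothetical values for three cells
conc = [10.0, 5.0, 0.0]            # mg/L
volumes = [250.0, 250.0, 250.0]    # m³
mass = dissolved_mass_kg(conc, volumes, porosity=0.25)
print(f"Dissolved mass in aquifer: {mass:.4f} kg")
```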

In [None]:
# TODO: Calculate mass balance
# Released mass = source concentration × water flux through the source cell × duration
# Mass in aquifer at each time = sum(C × cell_volume × porosity)
# Mass balance error from MT3DMS output
# Plot mass components vs time
# Verify mass balance error < 1%

---
## 12. Quality Checks

### Mass Balance Verification

Ensure mass is conserved throughout the simulation (error < 1%).

In [None]:
# TODO: Extract mass balance from MT3DMS output
# Check cumulative mass balance error
# Flag if error > 1%

### Stability Criteria

Check Courant and Peclet numbers to verify numerical stability.

**Courant number (advective stability):**
```
Cr = v·Δt / Δx ≤ 1
```

**Peclet number (dispersive stability):**
```
Pe = v·Δx / D = Δx / αL ≤ 2-4
```

Where:
- v = pore water velocity (m/day)
- Δt = time step (days)
- Δx = cell size (m)
- D = dispersion coefficient = αL·v
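
Both criteria are one-line calculations once a representative velocity is known. The values below (1 m/day, 5 m cells, 2.5-day transport step, αL = 10 m) are assumed for illustration; use the maximum velocity from your flow model for a conservative check.

```python
def courant_number(v, dt, dx):
    """Cr = v * dt / dx."""
    return v * dt / dx

def peclet_number(v, dx, alpha_l):
    """Pe = v * dx / D with D = alpha_L * v (requires v > 0)."""
    D = alpha_l * v
    return v * dx / D

# Assumed values
v = 1.0         # pore water velocity, m/day
dx = 5.0        # cell size, m
dt = 2.5        # transport time step, days
alpha_l = 10.0  # longitudinal dispersivity, m

cr = courant_number(v, dt, dx)
pe = peclet_number(v, dx, alpha_l)
max_dt = dx / v  # largest time step that keeps Cr <= 1

print(f"Cr = {cr:.2f}, Pe = {pe:.2f}, max stable dt = {max_dt:.1f} days")
```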

In [None]:
# TODO: Calculate stability numbers
# Extract velocity from flow model
# Calculate Courant number
# Calculate Peclet number
# Verify both are within acceptable ranges

### Physical Reasonableness Checks

Verify results make physical sense.

In [None]:
# TODO: Check for:
# - No negative concentrations
# - No concentrations > source concentration
# - Plume moves in expected flow direction
# - Maximum concentrations at reasonable locations
# - Plume shape consistent with dispersivity
# - Pumping wells show capture if within influence

---
## 13. Well-Contaminant Interaction Analysis

### Identify Pumping vs Injection Wells

Classify wells based on flow rates (negative = pumping, positive = injection).
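
With WEL records in the `[layer, row, col, rate]` layout used earlier in this notebook, the classification is a pair of list filters. The records below are invented for illustration.

```python
# Hypothetical WEL records: [layer, row, col, rate in m³/day]
wel_data = [
    [0, 10, 12, -500.0],
    [0, 14, 20, -500.0],
    [0, 30, 8, 1000.0],
]

pumping_wells = [w for w in wel_data if w[3] < 0]
injection_wells = [w for w in wel_data if w[3] > 0]

total_pumping = sum(w[3] for w in pumping_wells)
total_injection = sum(w[3] for w in injection_wells)

print(f"{len(pumping_wells)} pumping wells, total {total_pumping:,.0f} m³/day")
print(f"{len(injection_wells)} injection wells, total {total_injection:,.0f} m³/day")
```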

In [None]:
# TODO: Separate wells by type
# pumping_wells = wells with Q < 0
# injection_wells = wells with Q > 0
# Calculate total pumping and injection rates

### Pumping Well Capture Analysis

Assess whether pumping wells capture contamination.

**Key questions:**
- Do pumping wells intercept the plume?
- What concentration reaches the wells?
- What percentage of contaminant mass is captured?
- How does pumping affect plume migration?
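
The mass captured by one well can be approximated by integrating |Q|·C(t) over time with the trapezoidal rule. This is a sketch with made-up numbers; it assumes the concentration in the well cell represents the pumped water and that the pumping rate is constant.

```python
def captured_mass_kg(times_d, conc_mg_per_l, pumping_rate_m3d):
    """Mass removed by a well: trapezoidal integral of |Q| * C over time."""
    q = abs(pumping_rate_m3d)
    mass_mg = 0.0
    for i in range(1, len(times_d)):
        dt = times_d[i] - times_d[i - 1]
        c_mean = 0.5 * (conc_mg_per_l[i] + conc_mg_per_l[i - 1])
        mass_mg += q * c_mean * 1000.0 * dt  # m³/day * mg/L * 1000 L/m³ * days
    return mass_mg / 1e6  # mg -> kg

# Hypothetical breakthrough at a pumping well
times = [0, 100, 200, 300]     # days
conc = [0.0, 2.0, 2.0, 0.0]    # mg/L
captured = captured_mass_kg(times, conc, pumping_rate_m3d=-500.0)
print(f"Captured mass: {captured:.1f} kg")
```

Dividing the result by the total released mass gives the capture percentage asked for above.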

In [None]:
# TODO: For each pumping well:
# - Extract concentration time series at well location
# - Calculate time of breakthrough
# - Calculate peak concentration
# - Estimate mass captured
# Create capture zone map showing well influence on plume

### Injection Well Spreading Analysis

Assess whether injection wells spread contamination.

**Key questions:**
- Do injection wells push water toward or away from source?
- Could injection accelerate contaminant spreading?
- What is the zone of influence for injection wells?

In [None]:
# TODO: For injection wells:
# - Map flow vectors showing injection influence
# - Compare plume extent with vs without injection (conceptual)
# - Identify if plume is pushed toward sensitive receptors

### Net Effect on Plume Migration

Synthesize the combined effect of all wells on contaminant transport.

In [None]:
# TODO: Create summary visualization:
# - Concentration map with well capture zones
# - Flow vectors showing well influence
# - Plume centerline trajectory
# - Distance to compliance points

# Key findings table:
# - % mass captured by pumping wells
# - Maximum concentration at river
# - Time to reach compliance point
# - Protective vs. spreading effect of well field

---
## 14. OPTIONAL (Bonus Credit): Analytical Verification

### Why Consider Analytical Verification?

Analytical verification demonstrates professional modeling practice and deepens your understanding of when simple vs. complex methods are needed. **This section is optional but highly recommended** for:
- Earning **+5-10% bonus credit** on the assignment
- Understanding the limitations of 1D analytical solutions
- Learning when numerical modeling is truly necessary
- Demonstrating verification best practices

**If you choose to skip this section**, proceed directly to Section 15 (Sensitivity Analysis). Your report should focus on the numerical model results and well-contaminant interactions.

**If you choose to complete this section**, follow the Tier 1 requirements below.

---

### Tier 1 Requirements (Group 0 - Conservative Tracer)

Group 0 (conservative tracer) can perform full 1D Ogata-Banks comparison:
1. Extract 1D transect from MT3DMS results along flow direction
2. Implement 1D Ogata-Banks analytical solution
3. Plot comparison at multiple times
4. Calculate and plot breakthrough curves
5. Discuss discrepancies and their causes

**Time estimate**: 30-60 minutes with provided templates

---

### Extract 1D Transect from MT3DMS

Extract concentration along a transect from source in the direction of flow.
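
Sampling points along the flow direction can be generated from the source position and a velocity vector, as sketched below. The function name, the coordinates and the velocity components are hypothetical, and a straight transect is only a reasonable approximation where the flow field is fairly uniform.

```python
import math

def transect_points(x0, y0, vx, vy, length, spacing):
    """Points along a straight line from (x0, y0) in the direction (vx, vy).

    Returns a list of (distance, x, y) tuples.
    """
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        raise ValueError("Velocity vector has zero length")
    ux, uy = vx / speed, vy / speed          # unit direction vector
    n_points = int(length // spacing) + 1
    return [
        (i * spacing, x0 + i * spacing * ux, y0 + i * spacing * uy)
        for i in range(n_points)
    ]

# Hypothetical source position and velocity direction
points = transect_points(x0=0.0, y0=0.0, vx=3.0, vy=4.0, length=100.0, spacing=25.0)
for dist, x, y in points:
    print(f"{dist:6.1f} m -> ({x:.1f}, {y:.1f})")
```

Each (x, y) pair can then be mapped to a row and column and used to read the concentration from the UCN arrays.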

In [None]:
# TODO: Define transect line
# - Start at source
# - Extend in flow direction (use velocity vector)
# - Sample concentrations along transect at multiple times
# - Store distance from source and concentration

### Implement Ogata-Banks Analytical Solution

Calculate 1D analytical solution with same parameters as MT3DMS.

**For continuous source:**
```
C(x,t) = (C0/2) * [erfc((x-v*t)/(2*sqrt(D*t))) + 
                   exp(v*x/D) * erfc((x+v*t)/(2*sqrt(D*t)))]
```

**For pulse source (Group 0):**
Use superposition: solution at time t minus solution at time (t - t_pulse)
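
A minimal implementation of both cases using only the standard library is sketched below. For the pulse, C_pulse(x, t) = C(x, t) − C(x, t − t_pulse), where the second term is zero while t ≤ t_pulse. The parameter values are assumptions for illustration, and the second Ogata-Banks term is evaluated in log space to avoid overflow of exp(v·x/D) at large distances.

```python
import math

def ogata_banks(x, t, v, D, c0=1.0):
    """1D Ogata-Banks solution for a continuous source at x = 0."""
    if t <= 0.0:
        return 0.0  # nothing released yet
    s = 2.0 * math.sqrt(D * t)
    first = math.erfc((x - v * t) / s)
    tail = math.erfc((x + v * t) / s)
    # exp(v*x/D) can overflow for large x, so the product is formed in log space.
    # If erfc underflows to zero, the second term is neglected (an approximation).
    second = math.exp(v * x / D + math.log(tail)) if tail > 0.0 else 0.0
    return 0.5 * c0 * (first + second)

def ogata_banks_pulse(x, t, v, D, c0, t_pulse):
    """Finite-duration source by superposition: C(x, t) - C(x, t - t_pulse)."""
    return ogata_banks(x, t, v, D, c0) - ogata_banks(x, t - t_pulse, v, D, c0)

# Assumed example values (not taken from the case configuration)
v = 1.0        # pore water velocity, m/day
D = 10.0 * v   # alpha_L * v, m²/day
c0 = 100.0     # source concentration, mg/L

for t in (30.0, 90.0, 365.0):
    c = ogata_banks_pulse(x=100.0, t=t, v=v, D=D, c0=c0, t_pulse=30.0)
    print(f"t = {t:5.0f} d: C(100 m) = {c:.3f} mg/L")
```

A quick sanity check is that the continuous solution returns C0 at x = 0 for any t > 0, and that the pulse solution at x = 0 drops back to zero once the source is switched off.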

In [None]:
# TODO: Implement Ogata-Banks function
# Input parameters from MT3DMS setup:
#   - v = pore water velocity (K*i/n)
#   - D = dispersion coefficient (alpha_L * v)
#   - C0 = source concentration
#   - t_pulse = 30 days

# Calculate analytical solution for same times as MT3DMS
# Handle pulse source with superposition

### Compare Concentration Profiles

Plot analytical vs numerical concentration along transect at multiple times.

In [None]:
# TODO: Create comparison plots
# For times = [90d, 180d, 365d, 730d] (within the 2-year simulation period):
#   - Plot C(x) analytical (solid line)
#   - Plot C(x) MT3DMS (points)
#   - Add legend, labels, title
# Create multi-panel figure showing evolution

### Compare Breakthrough Curves

Plot C(t) at a fixed location: analytical vs numerical.

In [None]:
# TODO: Select observation distance (e.g., 200m downgradient)
# Calculate analytical C(t) at that distance
# Extract MT3DMS C(t) at nearest cell
# Plot both on same axes
# Calculate:
#   - Time of arrival (when C > 1% of C0)
#   - Peak concentration
#   - Time of peak

### Discuss Discrepancies

Analyze differences between analytical and numerical results.

**Expected differences due to:**
1. **2D/3D spreading**: Numerical model allows transverse dispersion, analytical is 1D only
2. **Grid discretization**: MT3DMS uses discrete cells, analytical is continuous
3. **Numerical dispersion**: Some artificial spreading from numerical scheme
4. **Well influence**: Wells affect flow field in 2D/3D but not in 1D analytical
5. **Boundary effects**: Numerical model has finite boundaries
6. **Heterogeneity**: Numerical model may have varying K, analytical assumes uniform

**Questions to address:**
- Where is agreement best? (near source vs far field, early vs late time)
- Are differences within acceptable uncertainty?
- When is analytical solution "good enough" vs when is numerical needed?
- What aspects cannot be captured by analytical methods?
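
Two simple agreement metrics, RMSE and the relative error in peak concentration, can be computed as below. The two profiles are invented numbers sampled at the same locations; both function names are illustrative.

```python
import math

def rmse(analytical, numerical):
    """Root-mean-square error between two equally long sequences."""
    n = len(analytical)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(analytical, numerical)) / n)

def relative_peak_error(analytical, numerical):
    """(numerical peak - analytical peak) / analytical peak."""
    peak_a = max(analytical)
    return (max(numerical) - peak_a) / peak_a

# Hypothetical concentration profiles along the transect (mg/L)
analytical = [0.0, 4.0, 10.0, 4.0]
numerical = [0.0, 5.0, 8.0, 5.0]

print(f"RMSE = {rmse(analytical, numerical):.3f} mg/L")
print(f"Relative peak error = {relative_peak_error(analytical, numerical):+.0%}")
```

A lower numerical peak combined with higher flanks, as in this made-up example, is the typical signature of extra spreading from transverse or numerical dispersion.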

In [None]:
# TODO: Quantify differences
# Calculate RMSE between analytical and numerical
# Calculate relative error at peak concentration
# Identify times/locations with largest discrepancies
# Create table summarizing agreement metrics

### Conclusions from Analytical Comparison

Synthesize findings and answer: **When is analytical sufficient, and when is numerical modeling necessary?**

**If you completed this optional section, write 2-3 paragraphs discussing:**

1. **Quality of agreement**: How well did analytical match numerical? Where/when was agreement best?

2. **Causes of discrepancies**: Which factors (2D effects, wells, boundaries, etc.) caused the largest differences?

3. **When to use each method**: 
   - When would you recommend analytical solution for this problem?
   - What aspects required numerical modeling?
   - How would you approach a similar problem in the future?

**Bonus credit value**: This section can earn you +5-10% extra credit depending on depth of analysis and quality of discussion.

---
## 15. Sensitivity Analysis

### Parameter Sensitivity

Test how results change with ±50% variation in dispersivity.

**Rationale**: Dispersivity is scale-dependent and uncertain. Understanding sensitivity helps assess prediction reliability.
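
The three cases can be generated from the base value so that the ±50% variation stays consistent if the base case changes. The 5 m cell size is an assumption; the grid Peclet number Δx/αL is printed as a reminder that the low-dispersivity case is the most demanding numerically.

```python
base_alpha_l = 10.0   # m, base-case longitudinal dispersivity
cell_size = 5.0       # m, assumed submodel cell size (hypothetical)

cases = {
    "low": 0.5 * base_alpha_l,    # -50 %: less spreading
    "base": base_alpha_l,
    "high": 1.5 * base_alpha_l,   # +50 %: more spreading
}

for name, alpha_l in cases.items():
    grid_peclet = cell_size / alpha_l   # Pe = Δx / αL when D = αL·v
    print(f"{name:>4}: alpha_L = {alpha_l:4.1f} m, grid Pe = {grid_peclet:.2f}")
```

Each value would then be passed as `al` to the DSP package in a separate model workspace, keeping all other inputs unchanged.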

In [None]:
# TODO: Run MT3DMS with:
# - Base case: αL = 10 m
# - Low case: αL = 5 m (less spreading)
# - High case: αL = 15 m (more spreading)

# Compare:
# - Plume extent at 2 years
# - Breakthrough time at monitoring point
# - Maximum concentration at compliance point

# Plot all three scenarios on same map

### Other Parameter Sensitivities (Optional)

If time permits, test sensitivity to:
- Porosity (affects velocity)
- Source concentration (linear scaling)
- Source duration (mass loading)
- Well pumping rates (capture efficiency)

---
## 16. Summary and Conclusions

### Key Findings

Summarize the main results of your transport modeling study.

**TODO: Summarize (bullet points):**

**Plume Behavior:**
- Maximum plume extent: X m² at Y months/years
- Maximum concentration at monitoring point: X mg/L at Y months/years
- Time to reach river/compliance point: X months/years (or "not reached in 2 years")

**Well Interactions:**
- Pumping wells captured X% of contaminant mass
- [Effect of injection wells on plume spreading]
- [Net protective or spreading effect of well field]

**Analytical Comparison (if completed):**
- Agreement quality: [good/moderate/poor] along flow transect
- Key discrepancies due to: [2D effects, wells, boundaries, etc.]
- Analytical would be sufficient for: [screening-level assessment]
- Numerical was needed for: [well interactions, accurate predictions]

**Sensitivity:**
- Results most sensitive to: [dispersivity, porosity, well rates]
- Uncertainty range on key predictions: ±X%

### Interpretation and Implications

Discuss what the results mean for the scenario.

**TODO: Write 2-3 paragraphs addressing:**

1. **Risk assessment**: Does contamination reach sensitive receptors? Are regulatory thresholds exceeded? What is the timeline for impact?

2. **Well field implications**: How do current pumping/injection operations affect contamination? Should well operations be modified?

3. **Uncertainty and limitations**: What are the main sources of uncertainty? What additional data would reduce uncertainty? What are model limitations?

4. **(If analytical comparison completed)** What insights did the analytical verification provide about when simple vs. complex models are needed?

### Recommendations

Provide actionable recommendations based on your findings.

**TODO: Recommend (bullet points):**

**Monitoring:**
- [Install monitoring wells at specific locations]
- [Sampling frequency and parameters]

**Well Operations:**
- [Adjust pumping rates if needed for capture]
- [Consider turning off/relocating injection if spreading contamination]

**Further Investigation:**
- [Additional site characterization needed]
- [Model refinements or alternative scenarios]

**Remediation (if applicable):**
- [Natural attenuation feasible? Time frame?]
- [Active remediation needed? What approach?]

---
## 17. Professional Report Preparation

### Report Structure (3-4 pages)

Your final deliverable is a professional PDF report with the following structure:

**1. Problem Statement and Objectives (0.5 page)**
- Brief description of TCE spill and well field
- Modeling objectives (key questions)
- Why numerical modeling was needed

**2. Methodology (0.75 page)**
- Transport model setup summary (domain, grid, parameters)
- Key assumptions and justification
- **(If completed)** Brief mention of analytical verification approach

**3. Results (1.5-2 pages, mostly figures)**
- **Figure 1**: Concentration map at 2 years with wells
- **Figure 2**: Breakthrough curve at critical location
- **Figure 3** (optional): Analytical vs numerical comparison (if completed)
- Brief text: Maximum concentrations, breakthrough times, plume extent
- Well-contaminant interaction summary

**4. Discussion and Conclusions (0.75-1 page)**
- Interpretation: What do results mean for scenario?
- **(If completed)** Analytical comparison: When is simple method sufficient?
- Parameter uncertainty: Which parameters matter most?
- Recommendations: Well management, monitoring, remediation

### Report Template

A Word/LaTeX template is provided in the course materials. Use it to ensure consistent formatting and professional appearance.

### Figure Quality Guidelines

- High resolution (at least 300 DPI)
- Clear labels and legends
- Descriptive captions
- Referenced in text
- Consistent color schemes

### Writing Tips

- **Be concise**: Every sentence should add value
- **Be specific**: "100 mg/L" not "high concentration"
- **Past tense**: "The model predicted..." not "The model predicts..."
- **Professional tone**: Technical but accessible
- **Cite sources**: FloPy, MT3DMS, parameter references

### Checklist Before Submission

- [ ] All figures have captions and are referenced in text
- [ ] Key results quantified (not just "the plume moved downgradient")
- [ ] **(If completed)** Analytical comparison included with discussion
- [ ] Recommendations are specific and actionable
- [ ] Report is 3-4 pages (not 10 pages!)
- [ ] Spell-checked and proofread
- [ ] PDF format (not Word doc)
- [ ] File named: `transport_report_group0.pdf`

### Grading Notes

**Base assignment (100%):**
- Technical implementation: 50%
- Professional report: 50%

**Bonus credit (optional):**
- Analytical verification (Section 14): +5-10% depending on quality and depth of analysis
- Total possible score: up to 110%

---
## Acknowledgments and References

### Software
- FloPy: Python package for MODFLOW (Bakker et al., 2016)
- MODFLOW-NWT: Groundwater flow model (Niswonger et al., 2011)
- MT3DMS: Multi-species transport model (Zheng and Wang, 1999)

### References

**TODO: Add references for:**
- TCE properties and parameter values
- Dispersivity scale relationships
- Ogata-Banks analytical solution
- Any other sources used

### Course Materials

This notebook is based on:
- Teaching notebook: `4b_transport_model_implementation.ipynb`
- Transport planning document: `transport_planning.md`
- Support repository: Analytical solution functions and plotting utilities

---

**End of Case Study**