### 🚦 Repository Freshness & Reset (Read Before Running)

This course repository should match the official `course_2025` branch so everyone works from the same, **clean** material. Each time you open this diagnostics notebook, you can run the cell below to verify that the repository matches the remote branch.

#### JupyterHub vs Local Safety
- On **JupyterHub** (teaching environment), an automatic *hard reset* can be convenient to remove stray/accidental edits.
- On your **local machine**, a hard reset could delete genuine work. So in local environments this notebook will **NOT** perform a destructive reset unless you explicitly opt in.

Environment detection: we treat the session as JupyterHub only if at least one `JUPYTERHUB_*` environment variable is present. Otherwise we assume "local" and fall back to *instructions only*.
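
The detection rule above can be sketched in a few lines. This is a minimal illustration, not necessarily how `repo_sync` implements it; the function name `in_jupyterhub` is hypothetical.

```python
import os

def in_jupyterhub(environ=None):
    """Return True if any environment variable name starts with JUPYTERHUB_."""
    # Hypothetical helper: accepts a mapping so the rule can be tried without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    return any(name.startswith("JUPYTERHUB_") for name in env)

print("JupyterHub detected:", in_jupyterhub())
```

Passing a plain dictionary, e.g. `in_jupyterhub({"JUPYTERHUB_USER": "student"})`, makes it easy to see how the rule behaves in either environment.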

#### Protect Your Personal Work
If you have made your own edits inside this teaching repository, a hard reset will ERASE them. To keep personal work (e.g. model experiments, trial scripts, modified notebooks):

1. Create a personal folder outside the teaching repo, e.g. `~/own_model_files/` (or use your home area / a personal git repo).
2. Copy any changed files you care about into that folder *before* running (or forcing) a reset.
3. Only then proceed with the reset step below.
4. Remember that you lose access to your JupyterHub instance once the semester ends, so copy important files to a personal location before then.
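
Steps 1 and 2 can also be done from Python. The sketch below is only an illustration: `backup_files` is a hypothetical helper, and the paths in the usage note are examples you would adapt to your own setup.

```python
import shutil
from pathlib import Path

def backup_files(repo_dir, relative_paths, dest_dir):
    """Copy selected files out of the repository, keeping their sub-folders."""
    repo_dir, dest_dir = Path(repo_dir), Path(dest_dir)
    copied = []
    for rel in relative_paths:
        src = repo_dir / rel
        if not src.is_file():
            continue  # skip entries that are missing or are not regular files
        target = dest_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

For example, `backup_files(Path.home() / "applied_groundwater_modelling", ["0_diagnostics.ipynb"], Path.home() / "own_model_files")` would copy one modified notebook into the personal folder from step 1.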

#### What the next code cell does
- Locates the repository directory (tries `~/applied_groundwater_modelling.git` then `~/applied_groundwater_modelling`).
- Runs `git fetch origin`.
- Lists uncommitted / untracked changes.
- If local changes exist: the cell does **NOT** reset automatically; it prints a warning and the list of changed files.
- If clean and in JupyterHub: performs a hard reset to `origin/course_2025` to guarantee a clean state (nothing is lost, because there are no local changes).
- If NOT in JupyterHub: only prints guidance unless you deliberately enable the local override flag.
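
The decision rules in this list can be summarised as a small pure function. This is a sketch under stated assumptions: the function name, the order of the checks, and the returned labels are illustrative and are not taken from `repo_sync` itself.

```python
def decide_action(in_hub, changed_files, force_reset=False,
                  allow_local_reset=False, dry_run=False):
    """Return a label describing what a sync step would do (illustrative only)."""
    if dry_run:
        return "report-only"      # never modify anything in dry-run mode
    if changed_files and not force_reset:
        return "blocked"          # local edits present: warn and list them
    if not in_hub and not allow_local_reset:
        return "guidance-only"    # local machine: print instructions only
    return "hard-reset"           # clean (or forced) and permitted to reset

# Situation from the sample output: local machine with two changed files
print(decide_action(False, ["0_diagnostics.ipynb", "SUPPORT_REPO/src/repo_sync.py"]))
```

Keeping the decision separate from the git calls makes the destructive branch easy to check before any command is run.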

> Run the next cell only when you're sure you don't need to keep local edits inside this repository.

---

In [None]:
# Repository sync (lean) – simplified: assumes this notebook runs from project root
# and that SUPPORT_REPO/src exists directly under it.

from importlib import reload
import sys
from pathlib import Path

support_src = Path('SUPPORT_REPO/src').resolve()
if not support_src.exists():
    raise RuntimeError("Expected SUPPORT_REPO/src at project root but it was not found.")
if str(support_src) not in sys.path:
    sys.path.insert(0, str(support_src))

import repo_sync  # type: ignore
reload(repo_sync)

# User-adjustable flags
FORCE_RESET = False          # destructive if True and changes exist
ALLOW_LOCAL_RESET = False    # allow destructive reset off JupyterHub
DRY_RUN = False              # report only
REPO_PATH_OVERRIDE = None    # optional absolute path override
TARGET_REMOTE = 'origin'
TARGET_BRANCH = 'course_2025'

result = repo_sync.sync_repository(
    force_reset=FORCE_RESET,
    allow_local_reset=ALLOW_LOCAL_RESET,
    dry_run=DRY_RUN,
    repo_path_override=REPO_PATH_OVERRIDE,
    target_remote=TARGET_REMOTE,
    target_branch=TARGET_BRANCH,
)
# repo_sync.print_sync_summary(result)
repo_sync_result = result

[ENV] JupyterHub detected: False
[INFO] Using repository: /Users/bea/Documents/GitHub/applied_groundwater_modelling
[STEP] Fetch latest remote refs...
[STEP] Checking status...

    M 0_diagnostics.ipynb
    ?? SUPPORT_REPO/src/repo_sync.py

You have local work in the teaching repository.
If you want to keep it, copy the changed files NOW to a personal directory, e.g.:
    mkdir -p ~/own_model_files
    cp <file_you_care_about> ~/own_model_files/
Then re-run with appropriate flags (force_reset / allow_local_reset) if you intend to discard changes.


=== SYNC SUMMARY ===
repo_path: /Users/bea/Documents/GitHub/applied_groundwater_modelling
jupyterhub: False
changed_files:
    M 0_diagnostics.ipynb
    ?? SUPPORT_REPO/src/repo_sync.py
performed_reset: False
blocked: True
message: Blocked: local changes present and force_reset is False
error: None

# Project Diagnostics Notebook (0_diagnostics)

Run this notebook on JupyterHub to verify your computational environment for the *Applied Groundwater Modelling* course.
It will:

1. Summarize Python & platform info
2. Parse environment YAML (if present) and compile required package list
3. Check imports & report missing packages + versions
4. Optionally install missing packages (the cell is commented out by default)
5. Verify geospatial stack (GeoPandas / Shapely / Fiona / PyProj / RasterIO / Contextily)
6. Test plotting with folium and plotly (interactive) and matplotlib (static)
7. Detect MODFLOW-2005 executable & run a minimal FloPy model
8. Basic 3D visualization capability check (plotly)
9. (Optional) Memory / performance snapshot (psutil)
10. Produce an aggregated summary at the end

If something fails, scroll to the first failing diagnostic cell to see the details.

---
**Tip:** Re-run the whole notebook after fixing issues to confirm readiness.

In [None]:
# Initialize a global results dictionary to accumulate checks
from __future__ import annotations
diag_results = {
    'python': {},
    'packages': {},
    'geospatial': {},
    'viz': {},
    'modflow': {},
    'system': {}
}
print('Diagnostics result container initialized.')

## 1. Python & Platform Information

In [None]:
import sys, platform, os, datetime, shutil
py_info = {
    'python_version': sys.version.replace('\n', ' '),
    'executable': sys.executable,
    'platform': platform.platform(),
    'processor': platform.processor(),
    'python_build': platform.python_build(),
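    # datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware timestamp
    'datetime_utc': datetime.datetime.now(datetime.timezone.utc).isoformat().replace('+00:00', 'Z')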
}
diag_results['python'] = py_info
print('Python/platform info collected:')
for k,v in py_info.items():
    print(f'  {k}: {v}')

## 2. Compile Required Package List
This attempts to parse `environment_students.yml` and `environment_development.yml` if available to build a dependency list.
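
To show what the name-extraction heuristic in this section does to a single line, here is a small standalone sketch. The helper `dep_name` is hypothetical and slightly stricter than the cell's own code (it also splits on `!` and `~`).

```python
import re

def dep_name(line):
    """Return the bare package name from a YAML list entry, or None."""
    entry = line.strip()
    if not entry.startswith("- "):
        return None  # not a list item (e.g. 'name:' or 'dependencies:')
    # Cut the entry at the first version operator or space
    token = re.split(r"[ =<>!~]", entry[2:].strip(), maxsplit=1)[0]
    # Reject tokens with other characters, such as the colon in 'pip:'
    return token if re.fullmatch(r"[A-Za-z0-9_.-]+", token) else None

for sample in ["  - numpy=1.26", "- flopy>=3.4", "- pip:", "name: course"]:
    print(repr(sample), "->", dep_name(sample))
```

Note that a name extracted this way is a conda or pip package name, which is not always the same as the Python import name.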

In [None]:
import re, json
required_packages = set()
yaml_files = ['environment_students.yml','environment_development.yml']
raw_dep_lines = []
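for yf in yaml_files:
    if os.path.exists(yf):
        try:
            in_deps = False  # collect only entries under the top-level 'dependencies:' key
            with open(yf,'r') as f:
                for line in f:
                    ls = line.strip()
                    # An unindented line that is not a list item or comment is a top-level key
                    if line[:1].strip() and not ls.startswith(('-', '#')):
                        in_deps = ls.startswith('dependencies:')
                    elif in_deps:
                        raw_dep_lines.append(ls)
        except Exception as e:
            print(f'Could not read {yf}: {e}')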
# Heuristic: capture package-like tokens (avoid channels, python=, version pins)
pattern = re.compile(r'^[A-Za-z0-9_.-]+(?:==|=)?[A-Za-z0-9_.-]*$')
skip_prefixes = ('python', 'pip', 'anaconda', 'mamba')
for ln in raw_dep_lines:
    if ln.startswith('- '):
        token = ln[2:].strip()
        # Cut off version spec after first = or space
        token = re.split(r'[ =<>]', token)[0]
        if token and pattern.match(token) and not token.lower().startswith(skip_prefixes):
            required_packages.add(token)
# Add core stack explicitly (ensures critical packages are checked)
core = [
 'numpy','pandas','matplotlib','geopandas','shapely','fiona','pyproj','rasterio',
 'contextily','folium','plotly','flopy','scipy','xarray','netCDF4','tqdm','ruamel.yaml',
 'IPython','jinja2','psutil'
 ]
for c in core: required_packages.add(c)
required_packages = sorted(required_packages)
print(f'Total packages to check: {len(required_packages)}')
print(', '.join(required_packages))
diag_results['packages']['required_list'] = required_packages

## 3. Import & Version Check
Attempts to import each required package and record version or error.
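
Some packages do not define `__version__`. One possible fallback, sketched here as an assumption rather than as part of the notebook's check, is to ask `importlib.metadata` for the installed distribution's version. The helper name `probe` and its `dist_name` parameter are hypothetical.

```python
import importlib
from importlib import metadata

def probe(import_name, dist_name=None):
    """Try to import a module and report its version, or the import error."""
    try:
        mod = importlib.import_module(import_name)
    except Exception as exc:
        return {"ok": False, "error": str(exc)}
    version = getattr(mod, "__version__", None)
    if version is None:
        try:
            # The distribution name may differ from the import name
            version = metadata.version(dist_name or import_name)
        except metadata.PackageNotFoundError:
            version = "unknown"
    return {"ok": True, "version": str(version)}

print(probe("os"))
print(probe("definitely_not_installed_xyz"))
```

The optional `dist_name` argument is there because the installed distribution can be named differently from the module that is imported.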

In [None]:
import importlib
package_status = {}
missing = []
for pkg in diag_results['packages']['required_list']:
    try:
        mod = importlib.import_module(pkg.replace('-', '_'))
        ver = getattr(mod, '__version__', 'unknown')
        package_status[pkg] = {'ok': True, 'version': ver}
    except Exception as e:
        package_status[pkg] = {'ok': False, 'error': str(e)}
        missing.append(pkg)
diag_results['packages']['status'] = package_status
print(f"Packages OK: {sum(1 for v in package_status.values() if v['ok'])}")
print(f'Packages MISSING/FAILED: {len(missing)}')
if missing:
    print('Missing:', ', '.join(missing))
else:
    print('All required packages imported successfully.')

### (Optional) Install Missing Packages
Uncomment and run the next cell ONLY if you have permission to install packages in this environment.

In [None]:
# Uncomment to attempt installation (may not work on restricted JupyterHub)
# if diag_results['packages'].get('status'):
#     to_install = [p for p,s in diag_results['packages']['status'].items() if not s['ok']]
#     if to_install:
#         import sys, subprocess
#         print('Attempting pip install for:', to_install)
#         subprocess.check_call([sys.executable, '-m', 'pip', 'install', *to_install])
#     else:
#         print('No missing packages to install.')

## 4. Geospatial Stack Smoke Tests

In [None]:
geo_checks = {}
try:
    import geopandas as gpd, shapely.geometry as geom, pyproj
    from shapely.ops import unary_union
    poly1 = geom.box(0,0,1,1)
    poly2 = geom.box(0.5,0.5,1.5,1.5)
    merged = unary_union([poly1, poly2])
    gdf = gpd.GeoDataFrame({'id':[1,2],'geometry':[poly1, poly2]}, crs='EPSG:4326')
    gdf_3857 = gdf.to_crs(3857)
    area_ratio = gdf_3857.area.sum()/gdf.area.sum() if gdf.area.sum() else None
    geo_checks['geopandas'] = True
    geo_checks['shapely_union_ok'] = merged.is_valid
    geo_checks['reproject_area_ratio'] = area_ratio
except Exception as e:
    geo_checks['error'] = str(e)
# Fiona / rasterio presence
for extra in ['fiona','rasterio','contextily']:
    try:
        __import__(extra)
        geo_checks[extra] = True
    except Exception as e:
        geo_checks[extra] = f'ERROR: {e}'
diag_results['geospatial'] = geo_checks
print('Geospatial checks:')
for k,v in geo_checks.items():
    print(f'  {k}: {v}')

## 5. Visualization Library Checks (folium, plotly, matplotlib)

In [None]:
viz_status = {}
# Folium
try:
    import folium
    fmap = folium.Map(location=[47.37, 8.55], zoom_start=10)
    folium.Marker([47.37, 8.55], tooltip='Zurich').add_to(fmap)
    viz_status['folium'] = 'ok (map object created)'
except Exception as e:
    viz_status['folium'] = f'ERROR: {e}'
# Plotly
try:
    import plotly.graph_objects as go
    fig = go.Figure(data=[go.Scatter(x=[0,1], y=[0,1])])
    viz_status['plotly'] = 'ok (scatter figure created)'
except Exception as e:
    viz_status['plotly'] = f'ERROR: {e}'
# Matplotlib
try:
    import matplotlib
    import matplotlib.pyplot as plt
    plt.figure(); plt.plot([0,1],[0,1]); plt.close()
    viz_status['matplotlib'] = 'ok (simple plot created)'
except Exception as e:
    viz_status['matplotlib'] = f'ERROR: {e}'
diag_results['viz'] = viz_status
print('Visualization stack:')
for k,v in viz_status.items():
    print(f'  {k}: {v}')
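# Display folium map last (if available). A bare name inside try/except is not
# rendered by Jupyter, so call display() explicitly.
try:
    from IPython.display import display
    display(fmap)
except NameError:
    pass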

## 6. MODFLOW-2005 Executable Detection & Minimal FloPy Model
Attempts to locate a MODFLOW-2005 executable and run a 1-layer steady-state test model.
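
For a confined layer with uniform conductivity and uniform cell width, the steady-state head varies linearly between the two constant-head cells. Because MODFLOW is block-centred, the specified heads apply at the cell centres, not at the model edges. The sketch below computes that reference profile; `expected_heads` is a hypothetical helper name.

```python
def expected_heads(ncol, h_left, h_right):
    """Linear head profile evaluated at the cell centres of a 1-D row.

    Assumes the first and last cells are constant-head cells, so the
    specified heads apply at their centres rather than at the model edges.
    """
    if ncol < 2:
        raise ValueError("need at least two cells")
    step = (h_right - h_left) / (ncol - 1)  # head change per cell
    return [h_left + step * j for j in range(ncol)]

# Same setup as the test model: 10 cells, head 10 on the left, 0 on the right
heads = expected_heads(10, 10.0, 0.0)
print([round(h, 3) for h in heads])
```

With 10 cells the head drops by 10/9 per cell, so the second cell should be close to 8.889 rather than 8.5.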

In [None]:
import tempfile, textwrap, math
modflow_diag = {}
try:
    import flopy
    exe_candidates = ['mf2005','mf2005.exe','mf2005dbl','mfnwt']
    found_exe = None
    for cand in exe_candidates:
        path = flopy.which(cand)
        if path:
            found_exe = path
            break
    modflow_diag['executable_found'] = bool(found_exe)
    modflow_diag['executable_path'] = found_exe
    if not found_exe:
        print('No MODFLOW-2005 style executable detected; skipping run.')
    else:
        print(f'Using executable: {found_exe}')
        with tempfile.TemporaryDirectory(prefix='mf2005_diag_') as ws:
            m = flopy.modflow.Modflow('diagtest', model_ws=ws, exe_name=found_exe)
            nlay, nrow, ncol = 1, 1, 10
            Lx = 100.0
            delr = Lx / ncol
            delc = 1.0
            top = 10.0
            botm = 0.0
            dis = flopy.modflow.ModflowDis(m, nlay, nrow, ncol, delr=delr, delc=delc, top=top, botm=botm)
            ibound = [[[1]*ncol]]
            # Constant heads at both ends
            ibound[0][0][0] = -1
            ibound[0][0][-1] = -1
            strt = [[ [top if j==0 else 0.0 if j==ncol-1 else top/2 for j in range(ncol)] ]]
            bas = flopy.modflow.ModflowBas(m, ibound=ibound, strt=strt)
            lpf = flopy.modflow.ModflowLpf(m, hk=10.0, vka=10.0, ipakcb=53)
            pcg = flopy.modflow.ModflowPcg(m)
            oc = flopy.modflow.ModflowOc(m)
            success, buff = m.run_model(silent=True)
            modflow_diag['run_success'] = success
            if success:
                from flopy.utils import HeadFile
                hf = HeadFile(os.path.join(ws,'diagtest.hds'))
                h = hf.get_data(kstpkper=(0,0))[0,0,:]
                modflow_diag['final_heads'] = h.tolist()
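                # Analytical linear solution: the constant heads apply at the centres of the
                # first and last cells, so the profile runs from 10 to 0 between those centres
                x = [delr*(i+0.5) for i in range(ncol)]
                analytical = [top*(1 - (xi - x[0])/(x[-1] - x[0])) for xi in x]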
                max_abs_err = max(abs(hi-ha) for hi,ha in zip(h, analytical))
                modflow_diag['max_abs_error_linear_solution'] = max_abs_err
                modflow_diag['analytical_ok'] = max_abs_err < 1e-3
                print(f'Model run OK. Max abs error vs linear solution: {max_abs_err:.2e}')
            else:
                print('Model run failed.')
except ModuleNotFoundError as e:
    modflow_diag['error'] = f'FloPy not installed: {e}'
except Exception as e:
    modflow_diag['error'] = f'Unexpected error: {e}'
diag_results['modflow'] = modflow_diag
modflow_diag

## 7. 3D Capability Check (Plotly Surface)

In [None]:
plotly_3d = {}
try:
    import numpy as np, plotly.graph_objects as go
    X, Y = np.mgrid[-2:2:30j, -2:2:30j]
    Z = np.exp(-(X**2 + Y**2))
    fig3d = go.Figure(data=[go.Surface(z=Z, x=X, y=Y, colorscale='Viridis')])
    fig3d.update_layout(title='3D Surface Test', margin=dict(l=0,r=0,b=0,t=30))
    plotly_3d['success'] = True
except Exception as e:
    plotly_3d['success'] = False
    plotly_3d['error'] = str(e)
diag_results['viz']['plotly_3d'] = plotly_3d
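# A bare name inside try/except is not rendered by Jupyter, so display the figure explicitly
try:
    from IPython.display import display
    display(fig3d)
except NameError:
    pass
plotly_3d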

## 8. System Resource Snapshot (Optional)

In [None]:
sys_snap = {}
try:
    import psutil
    vm = psutil.virtual_memory()
    sys_snap['memory_total_GB'] = round(vm.total/1024**3,2)
    sys_snap['memory_available_GB'] = round(vm.available/1024**3,2)
    proc = psutil.Process()
    sys_snap['process_memory_MB'] = round(proc.memory_info().rss/1024**2,1)
except Exception as e:
    sys_snap['error'] = str(e)
diag_results['system'] = sys_snap
sys_snap

## 9. Aggregated Summary
Run this cell last to see a compact readiness report.

In [None]:
from pprint import pprint
summary = {}
pkg_status = diag_results['packages'].get('status', {})
missing = [p for p,s in pkg_status.items() if not s['ok']]
summary['missing_packages'] = missing
summary['modflow_executable_found'] = diag_results['modflow'].get('executable_found')
summary['modflow_run_success'] = diag_results['modflow'].get('run_success')
summary['modflow_linear_solution_ok'] = diag_results['modflow'].get('analytical_ok')
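# The GeoPandas smoke test stores failures under the key 'error' (no 'ERROR' prefix), so check both
summary['geospatial_errors'] = [k for k,v in diag_results['geospatial'].items() if k == 'error' or (isinstance(v,str) and v.startswith('ERROR'))]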
summary['plotly_3d_success'] = diag_results['viz'].get('plotly_3d',{}).get('success')
# bool() keeps the flag True/False even when a MODFLOW entry is missing (None)
summary['overall_ready'] = bool((not missing) and summary['modflow_executable_found'] and summary['modflow_run_success'] and (summary['geospatial_errors']==[]))
diag_results['summary'] = summary
print('=== DIAGNOSTICS SUMMARY ===')
pprint(summary)
if not summary['overall_ready']:
    print('\nOne or more checks failed. See above cells for details.')
else:
    print('\nEnvironment appears READY for the course.')